Machine Learning in Physics
I taught this class at the TU Wien in Spring 2021. It is an introduction to basic
machine learning techniques, with extensive formal justification, and applied to simple
problems from physics.
After completing the class, students are supposed to be able to take a physics problem and:
- analyze whether Machine Learning is applicable,
- translate the problem into a suitable optimization problem,
- decide which learning algorithms are suitable for its solution,
- program simple routines themselves as well as use libraries where applicable, and
- understand and verify the results of the learning procedure.
The lecture can be roughly divided into three parts:
- Python and Optimization (exercises 1–3): Since physics undergrads at the
TU Wien learn coding in C++ and not Python, the class starts with a Python crash
course. As exercises that accompany this introduction and are also relevant to ML,
we look at simple optimization problems and solve them with (accelerated)
gradient descent or Newton's method.
- Linear models (exercises 4–7):
This is the centrepiece of the class. I introduce the “scientific method for machines”,
i.e., the machine learning workflow, and we run through it multiple times while cranking
up the complexity.
The core analysis technique used is the singular value decomposition. We use it to prove the
bias–variance tradeoff and perform convergence analysis for gradient descent.
- Advanced models (exercises 8–11):
Here we construct artificial neural networks by building them up from logistic models, which are
We then move to unsupervised learning, to make sure that students have seen a paradigm
other than supervised learning. We can reuse our SVD analysis to construct low-rank approximations, which
concludes the class.
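To illustrate the last point, a low-rank approximation can be read off directly from the SVD by keeping only the largest singular values. The following is a minimal sketch, not the course's own code: the function name `low_rank_approximation` and the random test matrix are assumptions made for illustration.

```python
import numpy as np

def low_rank_approximation(A, rank):
    """Truncated SVD: best rank-`rank` approximation of A in the Frobenius norm.

    Hypothetical helper for illustration; also returns all singular values.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the `rank` largest singular values and their singular vectors.
    A_k = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
    return A_k, s

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 20))   # made-up data matrix, not course data
k = 5
A_k, s = low_rank_approximation(A, k)

# The Frobenius error is determined by the discarded singular values alone.
error = np.linalg.norm(A - A_k)
expected = np.sqrt(np.sum(s[k:] ** 2))
print(error, expected)
```

The two printed numbers should agree up to rounding, which is the statement that the truncation error is carried entirely by the singular values that were dropped.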
The idea of this class is a little different from your usual shiny happy pictures class on machine learning:
there is a heavy focus on simple models, in particular the linear model, and rigorous analysis.
There are three reasons for this:
- The linear model can be used to justify a lot of the “common wisdom” in machine learning
rigorously and at the same time pedagogically. I found that actually being able to understand and prove, e.g.,
convergence behaviour is as important as observing it in the wild.
- In my experience, students approach a machine learning problem in physics from “the wrong end”,
i.e., by throwing the most sophisticated model at it without much a-priori analysis, instead of doing
sophisticated a-priori analysis and then starting with the simplest model that is reasonable.
- Convex cost functions mean that linear and logistic models are straightforward to train. I found that writing
simple training code yourself greatly helps in “demystifying” machine learning libraries.
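As a rough idea of what such a hand-written training routine can look like, here is a minimal sketch of plain gradient descent for a linear least-squares model. It is not taken from the course exercises: the function name, the synthetic data, and the choice of step size 1/σ_max² (from the largest singular value of the design matrix) are assumptions for illustration.

```python
import numpy as np

def gradient_descent_least_squares(X, y, n_steps=500):
    """Minimize 0.5 * ||X w - y||^2 by plain gradient descent (hypothetical helper)."""
    # Singular values are returned in descending order; the largest one
    # gives a safe step size 1 / sigma_max^2 for this convex cost.
    sigma = np.linalg.svd(X, compute_uv=False)
    step = 1.0 / sigma[0] ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        grad = X.T @ (X @ w - y)   # gradient of the quadratic cost
        w -= step * grad
    return w

# Synthetic data (assumed for illustration): linear model plus small noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=100)

w_gd = gradient_descent_least_squares(X, y)
w_ref, *_ = np.linalg.lstsq(X, y, rcond=None)   # library solution for comparison
print(w_gd, w_ref)
```

Comparing the hand-written result with `np.linalg.lstsq` is the kind of check that connects a simple training loop to what a library does internally.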