II — Probability, Geometry & Learning → Chapter 8
FROM SYSTEMS TO FRONTIER ML

What 'learning' actually is

Loss, gradient descent, SGD/Adam, regularization (then→now), overfitting. Modern terminology vs your 2007 version.

§1 Loss functions & empirical risk §2 Gradient descent & SGD §3 Momentum, Adam, AdamW

← ALL CHAPTERS