Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
Abstract
We investigate the test risk of continuous-time stochastic gradient flow dynamics in learning theory. Using a path integral formulation, we provide, in the small-learning-rate regime, a general formula for the difference between the test risk curves of pure gradient flow and stochastic gradient flow. We apply the general theory to a simple model of weak features, which displays the double descent phenomenon, and explicitly compute the corrections introduced by the stochastic term in the dynamics as a function of time and model parameters. The analytical results are compared with simulations of discrete-time stochastic gradient descent and show good agreement.
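As a rough illustration of the kind of comparison the abstract describes, the sketch below tracks the test risk of full-batch gradient descent and mini-batch stochastic gradient descent on the same data. It is not the paper's weak-features model or its path integral formula: the linear regression setup with isotropic Gaussian inputs, the function name `simulate`, and all parameter values are assumptions chosen only to keep the example self-contained.

```python
import numpy as np

def simulate(n=50, d=20, steps=2000, lr=0.01, batch=5, noise=0.1, seed=0):
    """Return test-risk curves for full-batch GD and mini-batch SGD.

    Hypothetical setup (not taken from the paper): y = x . w_star + noise * eps
    with x ~ N(0, I_d), squared loss, both runs started from w = 0.
    """
    rng = np.random.default_rng(seed)
    w_star = rng.normal(size=d) / np.sqrt(d)
    X = rng.normal(size=(n, d))
    y = X @ w_star + noise * rng.normal(size=n)

    def test_risk(w):
        # For isotropic Gaussian inputs the population squared error is
        # E[(x . w - y)^2] = ||w - w_star||^2 + noise^2.
        return np.sum((w - w_star) ** 2) + noise ** 2

    w_gd = np.zeros(d)
    w_sgd = np.zeros(d)
    risk_gd, risk_sgd = [], []
    for _ in range(steps):
        # Full-batch gradient of (1 / 2n) * ||X w - y||^2.
        w_gd -= lr * X.T @ (X @ w_gd - y) / n
        # Mini-batch gradient on a random subset of the training set.
        idx = rng.choice(n, size=batch, replace=False)
        Xb, yb = X[idx], y[idx]
        w_sgd -= lr * Xb.T @ (Xb @ w_sgd - yb) / batch
        risk_gd.append(test_risk(w_gd))
        risk_sgd.append(test_risk(w_sgd))
    return np.array(risk_gd), np.array(risk_sgd)

if __name__ == "__main__":
    risk_gd, risk_sgd = simulate()
    # Gap between the two curves at the final step (sign and size depend on
    # the seed, learning rate, and batch size).
    print("final GD test risk: ", risk_gd[-1])
    print("final SGD test risk:", risk_sgd[-1])
    print("final difference:   ", risk_sgd[-1] - risk_gd[-1])
```

Averaging `risk_sgd - risk_gd` over many seeds and shrinking `lr` would give an empirical counterpart of the small-learning-rate correction the abstract refers to, under the stated assumptions.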
- Publication: arXiv e-prints
- Pub Date: February 2024
- arXiv: arXiv:2402.07626
- Bibcode: 2024arXiv240207626V
- Keywords: Statistics - Machine Learning; Condensed Matter - Disordered Systems and Neural Networks; Computer Science - Machine Learning
- E-Print: Accepted to ICML 2024