On dissipative symplectic integration with applications to gradient-based optimization
Abstract
Recently, continuous-time dynamical systems have proved useful in providing conceptual and quantitative insights into gradient-based optimization, widely used in modern machine learning and statistics. An important question that arises in this line of work is how to discretize the system in such a way that its stability and rates of convergence are preserved. In this paper we propose a geometric framework in which such discretizations can be realized systematically, enabling the derivation of 'rate-matching' algorithms without the need for a discrete convergence analysis. More specifically, we show that a generalization of symplectic integrators to non-conservative and in particular dissipative Hamiltonian systems is able to preserve rates of convergence up to a controlled error. Moreover, such methods preserve a shadow Hamiltonian despite the absence of a conservation law, extending key results of symplectic integrators to non-conservative cases. Our arguments rely on a combination of backward error analysis with fundamental results from symplectic geometry. We stress that although the original motivation for this work was the application to optimization, where dissipative systems play a natural role, our results are fully general: they not only provide a differential geometric framework for dissipative Hamiltonian systems but also substantially extend the theory of structure-preserving integration.
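To make the idea concrete, below is a minimal sketch (not the paper's exact scheme) of one common dissipative symplectic discretization, a "conformal" symplectic Euler step for the damped Hamiltonian system dq/dt = p, dp/dt = -gamma*p - grad f(q), applied to a toy quadratic objective. The objective, step size, and damping coefficient are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Toy ill-conditioned quadratic f(q) = 0.5 * q^T A q (assumed example)
A = np.diag([1.0, 10.0])

def grad_f(q):
    return A @ q

q = np.array([3.0, -2.0])   # initial position
p = np.zeros_like(q)        # initial momentum
h, gamma = 0.05, 1.0        # step size and damping coefficient (assumed)

for k in range(500):
    # Conformal symplectic Euler step: the damping term is integrated
    # exactly via the factor exp(-gamma*h), while the conservative part
    # is handled by a standard symplectic Euler update.
    p = np.exp(-gamma * h) * p - h * grad_f(q)
    q = q + h * p

print("final q:", q, "f(q):", 0.5 * q @ A @ q)
```

Splitting the flow this way keeps the dissipative contraction of phase-space volume exact at each step, which is the kind of structure preservation the paper's backward error analysis relies on.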
- Publication: Journal of Statistical Mechanics: Theory and Experiment
- Pub Date: April 2021
- DOI: 10.1088/1742-5468/abf5d4
- arXiv: arXiv:2004.06840
- Bibcode: 2021JSMTE2021d3402F
- Keywords: machine learning; optimization under uncertainty; analysis of algorithms; Mathematics - Optimization and Control; Condensed Matter - Disordered Systems and Neural Networks; Condensed Matter - Statistical Mechanics; Statistics - Machine Learning
- E-Print: matches published version