Last-Iterate Convergence of Saddle-Point Optimizers via High-Resolution Differential Equations
Abstract
Several widely-used first-order saddle-point optimization methods yield an identical continuous-time ordinary differential equation (ODE) that is identical to that of the Gradient Descent Ascent (GDA) method when derived naively. However, the convergence properties of these methods are qualitatively different, even on simple bilinear games. Thus the ODE perspective, which has proved powerful in analyzing single-objective optimization methods, has not played a similar role in saddle-point optimization. We adopt a framework studied in fluid dynamics -- known as High-Resolution Differential Equations (HRDEs) -- to design differential equation models for several saddle-point optimization methods. Critically, these HRDEs are distinct for various saddle-point optimization methods. Moreover, in bilinear games, the convergence properties of the HRDEs match the qualitative features of the corresponding discrete methods. Additionally, we show that the HRDE of Optimistic Gradient Descent Ascent (OGDA) exhibits \emph{last-iterate convergence} for general monotone variational inequalities. Finally, we provide rates of convergence for the \emph{best-iterate convergence} of the OGDA method, relying solely on the first-order smoothness of the monotone operator.
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2021
- DOI:
- arXiv:
- arXiv:2112.13826
- Bibcode:
- 2021arXiv211213826C
- Keywords:
-
- Mathematics - Optimization and Control;
- Computer Science - Machine Learning;
- Statistics - Machine Learning
- E-Print:
- Minimax Theory 8, Number 2 (2023) 333--380