Finding Second-Order Stationary Point for Nonconvex-Strongly-Concave Minimax Problem
Abstract
We study the smooth minimax optimization problem of the form $\min_{\bf x}\max_{\bf y} f({\bf x},{\bf y})$, where the objective function is strongly concave in ${\bf y}$ but possibly nonconvex in ${\bf x}$. This problem covers many applications in machine learning, such as regularized GANs, reinforcement learning, and adversarial training. Most existing theory on gradient descent ascent focuses on establishing convergence to a first-order stationary point of $f({\bf x},{\bf y})$ or of the primal function $P({\bf x})\triangleq \max_{\bf y} f({\bf x},{\bf y})$. In this paper, we design a new optimization method based on cubic Newton iterations, which can find an ${\mathcal O}\left(\varepsilon,\kappa^{1.5}\sqrt{\rho\varepsilon}\right)$-second-order stationary point of $P({\bf x})$ with ${\mathcal O}\left(\kappa^{1.5}\sqrt{\rho}\,\varepsilon^{-1.5}\right)$ second-order oracle calls and $\tilde{\mathcal O}\left(\kappa^{2}\sqrt{\rho}\,\varepsilon^{-1.5}\right)$ first-order oracle calls, where $\kappa$ is the condition number and $\rho$ is the Hessian smoothness coefficient of $f({\bf x},{\bf y})$. For high-dimensional problems, we propose a variant algorithm that avoids the expensive cost of the second-order oracle by solving the cubic subproblem inexactly via gradient descent and matrix Chebyshev expansion. This strategy still attains the desired approximate second-order stationary point with high probability, but requires only $\tilde{\mathcal O}\left(\kappa^{1.5}\ell\varepsilon^{-2}\right)$ Hessian-vector oracle calls and $\tilde{\mathcal O}\left(\kappa^{2}\sqrt{\rho}\,\varepsilon^{-1.5}\right)$ first-order oracle calls. To the best of our knowledge, this is the first work that considers the non-asymptotic convergence behavior of finding second-order stationary points for minimax problems without the convex-concave assumption.
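To make the problem setting concrete, the following is a minimal sketch (not from the paper) of the first-order baseline the abstract mentions: two-timescale gradient descent ascent on a toy nonconvex-strongly-concave objective $f(x,y)=\sin(x)\,y - y^2/2$, whose primal function is $P(x)=\max_y f(x,y)=\sin^2(x)/2$. The objective, step sizes, and iteration count are illustrative assumptions; the paper's cubic-Newton method is not shown here.

```python
import numpy as np

# Toy nonconvex-strongly-concave objective (illustrative only):
#   f(x, y) = sin(x) * y - y^2 / 2
# It is 1-strongly concave in y, nonconvex in x, and its primal
# function is P(x) = max_y f(x, y) = sin(x)^2 / 2.

def grad_x(x, y):
    # partial derivative of f with respect to x
    return np.cos(x) * y

def grad_y(x, y):
    # partial derivative of f with respect to y
    return np.sin(x) - y

def gradient_descent_ascent(x0, y0, eta_x=0.05, eta_y=0.5, iters=2000):
    """Two-timescale GDA: descend on x, ascend on y (hypothetical parameters)."""
    x, y = x0, y0
    for _ in range(iters):
        x = x - eta_x * grad_x(x, y)  # descent step on the min variable
        y = y + eta_y * grad_y(x, y)  # ascent step on the max variable
    return x, y

x, y = gradient_descent_ascent(1.0, 0.0)
# At a first-order stationary point, y tracks argmax_y f(x, .) = sin(x)
# and both partial gradients are (approximately) zero.
```

This sketch only reaches a first-order stationary point of $P$; distinguishing saddle points of $P$ from local minima is exactly what the second-order analysis in the paper addresses.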
 Publication:

arXiv e-prints
 Pub Date:
 October 2021
 arXiv:
 arXiv:2110.04814
 Bibcode:
 2021arXiv211004814L
 Keywords:

 Mathematics - Optimization and Control;
 Computer Science - Machine Learning