On the Sublinear Convergence of Randomly Perturbed Alternating Gradient Descent to Second Order Stationary Solutions
Abstract
Alternating gradient descent (AGD) is a simple but popular algorithm that has been applied to problems in optimization, machine learning, data mining, signal processing, and other areas. The algorithm updates two blocks of variables in an alternating manner: a gradient step is taken on one block while the other block is kept fixed. When the objective function is nonconvex, it is well known that AGD converges to a first-order stationary solution at a global sublinear rate. In this paper, we show that a variant of AGD-type algorithms will not be trapped by "bad" stationary solutions such as saddle points and local maxima. In particular, we consider a smooth unconstrained optimization problem and propose a perturbed AGD (PAGD) that converges (with high probability) to the set of second-order stationary solutions (SS2) at a global sublinear rate. To the best of our knowledge, this is the first alternating-type algorithm that takes $\mathcal{O}(\text{polylog}(d)/\epsilon^{7/3})$ iterations to achieve SS2 with high probability [where polylog$(d)$ is a polynomial in the logarithm of the problem dimension $d$].
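To make the alternating update and the role of the perturbation concrete, the following is a minimal sketch in Python. It is not the PAGD procedure analyzed in the paper: the abstract does not specify the perturbation rule, step size, or thresholds, so the names `perturbed_agd`, `g_thresh`, `radius`, and `wait`, as well as the toy objective $f(x, y) = \tfrac{1}{2}x^2 + xy + \tfrac{1}{4}y^4$, are assumptions chosen only for illustration. This objective has a strict saddle at the origin and two local minima at $(-1, 1)$ and $(1, -1)$.

```python
import math
import random


def grad_x(x, y):
    # Partial derivative of the toy objective f(x, y) = x^2/2 + x*y + y^4/4 in x.
    return x + y


def grad_y(x, y):
    # Partial derivative of the same toy objective in y.
    return x + y ** 3


def perturbed_agd(x, y, step=0.1, iters=2000, g_thresh=1e-3,
                  radius=1e-2, wait=50, seed=0):
    """Illustrative perturbed alternating gradient descent on two scalar blocks.

    All parameter names and default values are hypothetical; they are not
    taken from the paper.
    """
    rng = random.Random(seed)
    last_perturb = -wait
    for t in range(iters):
        gnorm = math.hypot(grad_x(x, y), grad_y(x, y))
        # Near a first-order stationary point, add a small random perturbation
        # (at most once every `wait` iterations) so the iterate can leave a saddle.
        if gnorm <= g_thresh and t - last_perturb >= wait:
            x += rng.uniform(-radius, radius)
            y += rng.uniform(-radius, radius)
            last_perturb = t
        # Alternating step: update block x with y held fixed ...
        x = x - step * grad_x(x, y)
        # ... then update block y using the freshly updated x.
        y = y - step * grad_y(x, y)
    return x, y


if __name__ == "__main__":
    # Started exactly at the saddle, unperturbed AGD (radius=0) never moves.
    print("no perturbation:", perturbed_agd(0.0, 0.0, radius=0.0))
    # With perturbation, the iterate leaves the saddle and approaches a minimum.
    print("with perturbation:", perturbed_agd(0.0, 0.0))
```

In this sketch the perturbation keeps firing near a minimum as well, so the final iterate is only accurate up to roughly the perturbation radius; a practical variant would need a termination test, which is omitted here.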
 Publication:

arXiv e-prints
 Pub Date:
 February 2018
 DOI:
 10.48550/arXiv.1802.10418
 arXiv:
 arXiv:1802.10418
 Bibcode:
 2018arXiv180210418L
 Keywords:

 Mathematics - Optimization and Control;
 Computer Science - Information Theory;
 Statistics - Machine Learning