Variance-Reduced Proximal Stochastic Gradient Descent for Non-convex Composite optimization

doi:10.48550/arXiv.1606.00602

We study non-convex composite optimization, in which the objective is the sum of two terms: a finite sum of smooth but non-convex functions, and a general function that admits a simple proximal mapping. Most research on stochastic methods for composite optimization assumes convexity or strong convexity of each function. In this paper, we extend the analysis to the non-convex setting using variance-reduction techniques such as prox-SVRG and prox-SAGA. We prove that, with a constant step size, both prox-SVRG and prox-SAGA are suitable for non-convex composite optimization and converge to a stationary point within $O(1/\epsilon)$ iterations. This is similar to the convergence rate of the state-of-the-art RSAG method and faster than that of stochastic gradient descent. Our analysis also extends to the mini-batch setting, which linearly accelerates convergence. To the best of our knowledge, this is the first convergence-rate analysis of variance-reduced proximal stochastic gradient methods for non-convex composite optimization.
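To make the update described in the abstract concrete, here is a minimal sketch of a mini-batch prox-SVRG loop. It is not the authors' implementation, and it illustrates only the general form of the update, not any convergence guarantee (the record notes the paper was withdrawn because of an error in the convergence proof). Several things are assumptions made for illustration: the non-smooth term is taken to be lam * ||x||_1, whose proximal mapping is soft-thresholding; the names prox_svrg, grad_i, step, lam, inner, and batch are hypothetical; and the snapshot is simply the last inner iterate.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal map of tau * ||.||_1 (the assumed 'simple' non-smooth term)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def prox_svrg(grad_i, n, x0, step, lam, epochs=20, inner=None, batch=1, seed=0):
    """Sketch of mini-batch prox-SVRG with a constant step size.

    grad_i(x, i) must return the gradient of the i-th smooth component at x.
    """
    rng = np.random.default_rng(seed)
    x_snap = np.asarray(x0, dtype=float).copy()
    inner = inner or n  # assumed default: one pass' worth of inner steps
    for _ in range(epochs):
        # Full gradient of the finite sum at the snapshot point.
        full_grad = sum(grad_i(x_snap, i) for i in range(n)) / n
        x = x_snap.copy()
        for _ in range(inner):
            idx = rng.integers(0, n, size=batch)
            # Variance-reduced estimate: the stochastic gradient at x,
            # corrected by the same components evaluated at the snapshot.
            v = full_grad + sum(grad_i(x, i) - grad_i(x_snap, i) for i in idx) / batch
            # Gradient step on the smooth part, then the proximal step.
            x = soft_threshold(x - step * v, step * lam)
        x_snap = x  # assumed snapshot rule: keep the last inner iterate
    return x_snap
```

A caller would supply grad_i for its own smooth components, for example the gradient of a non-convex robust loss such as log(1 + r^2 / 2) with residual r = a_i . x - b_i. A prox-SAGA variant would replace the periodic snapshot with a table of the most recently evaluated component gradients.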

Publication:

arXiv e-prints

Pub Date:

June 2016

DOI:

10.48550/arXiv.1606.00602

arXiv:

arXiv:1606.00602

Bibcode:

2016arXiv160600602Y

Keywords:

Statistics - Machine Learning;
Computer Science - Machine Learning;
Computer Science - Numerical Analysis

E-Print:

This paper has been withdrawn by the author due to an error in the proof of the convergence rate. They will modify this proof as soon as possible.

NASA/ADS