Stochastic Gradient Methods with Preconditioned Updates

doi:10.48550/arXiv.2206.00285

This work considers the non-convex finite-sum minimization problem. Several algorithms exist for such problems, but they often perform poorly when the problem is badly scaled and/or ill-conditioned; a primary goal of this work is to introduce methods that alleviate this issue. To that end, we use a preconditioner based on Hutchinson's approach to approximating the diagonal of the Hessian and couple it with several gradient-based methods to obtain new scaled algorithms: Scaled SARAH and Scaled L-SVRG. We present theoretical complexity guarantees under smoothness assumptions and prove linear convergence when both smoothness and the Polyak-Łojasiewicz (PL) condition are assumed. Because the adaptively scaled methods use approximate partial second-order curvature information, they can better mitigate the impact of badly scaled problems. Numerical experiments presented in this work demonstrate this improved practical performance.
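To illustrate the idea of a Hutchinson-style diagonal preconditioner, here is a minimal Python sketch. It is not the paper's algorithm: the function names `hutchinson_diag` and `preconditioned_step`, the Hessian-vector-product callback `hvp`, the step size `lr`, and the safeguard `alpha` are all hypothetical choices made for this illustration. The sketch relies only on the standard identity that, for a random vector z with independent ±1 entries, the expectation of z ⊙ (Hz) equals the diagonal of H.

```python
import numpy as np

def hutchinson_diag(hvp, dim, num_samples=100, rng=None):
    """Estimate diag(H) by averaging z * (H z) over Rademacher vectors z.

    hvp is an assumed callback that returns the Hessian-vector product H @ z.
    """
    rng = np.random.default_rng() if rng is None else rng
    estimate = np.zeros(dim)
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=dim)  # Rademacher sample
        estimate += z * hvp(z)                 # elementwise product
    return estimate / num_samples

def preconditioned_step(x, grad, diag_estimate, lr=0.1, alpha=1e-8):
    """One scaled gradient step: divide the gradient by the estimated curvature.

    Taking absolute values and flooring at alpha (an assumption of this sketch)
    keeps the scaling positive and bounded away from zero.
    """
    scale = np.maximum(np.abs(diag_estimate), alpha)
    return x - lr * grad / scale

# Badly scaled quadratic f(x) = 0.5 * x^T A x, whose Hessian is A.
A = np.diag([1.0, 100.0])
x0 = np.array([1.0, 1.0])
d = hutchinson_diag(lambda v: A @ v, dim=2, num_samples=10,
                    rng=np.random.default_rng(0))
x1 = preconditioned_step(x0, A @ x0, d, lr=1.0)
```

For a diagonal Hessian the estimate is exact for every sample, so on this toy problem a single scaled step with `lr=1.0` lands on the minimizer despite the factor of 100 between the two curvatures. With off-diagonal entries the estimate is only correct on average and improves as `num_samples` grows.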

Publication:

arXiv e-prints

Pub Date:

June 2022

DOI:

10.48550/arXiv.2206.00285

arXiv:

arXiv:2206.00285

Bibcode:

2022arXiv220600285S

Keywords:

Mathematics - Optimization and Control;
Computer Science - Machine Learning

E-Print:

40 pages, 2 new algorithms, 20 figures, 4 tables

NASA/ADS
