Why Do Local Methods Solve Nonconvex Problems?

doi:10.48550/arXiv.2103.13462

Why Do Local Methods Solve Nonconvex Problems?

Ma, Tengyu

Non-convex optimization is ubiquitous in modern machine learning. Researchers devise non-convex objective functions and optimize them using off-the-shelf optimizers such as stochastic gradient descent and its variants, which leverage the local geometry and update iteratively. Even though solving non-convex functions is NP-hard in the worst case, the optimization quality in practice is often not an issue -- optimizers are largely believed to find approximate global minima. Researchers hypothesize a unified explanation for this intriguing phenomenon: most of the local minima of the practically-used objectives are approximately global minima. We rigorously formalize it for concrete instances of machine learning problems.

Publication:

arXiv e-prints

Pub Date:

March 2021

DOI:

10.48550/arXiv.2103.13462

arXiv:

arXiv:2103.13462

Bibcode:

2021arXiv210313462M

Keywords:

Computer Science - Machine Learning;
Computer Science - Data Structures and Algorithms;
Mathematics - Optimization and Control;
Statistics - Machine Learning

E-Print:

This is the Chapter 21 of the book "Beyond the Worst-Case Analysis of Algorithms"

NASA/ADS

Why Do Local Methods Solve Nonconvex Problems?

Abstract