A survey of deep learning optimizers -- first and second order methods

doi:10.48550/arXiv.2211.15596

Kashyap, Rohan

Deep learning optimization involves minimizing a high-dimensional loss function over the weight space, a task often considered difficult because of saddle points, local minima, ill-conditioning of the Hessian, and limited compute resources. In this paper, we provide a comprehensive review of $14$ standard optimization methods successfully used in deep learning research, together with a theoretical assessment of the difficulties in numerical optimization drawn from the optimization literature.

Publication:

arXiv e-prints

Pub Date:

November 2022

DOI:

10.48550/arXiv.2211.15596

arXiv:

arXiv:2211.15596

Bibcode:

2022arXiv221115596K

Keywords:

Computer Science - Machine Learning;
Computer Science - Computer Vision and Pattern Recognition;
Mathematics - Optimization and Control
