A survey of deep learning optimizers -- first and second order methods
Abstract
Deep Learning optimization involves minimizing a high-dimensional loss function in the weight space which is often perceived as difficult due to its inherent difficulties such as saddle points, local minima, ill-conditioning of the Hessian and limited compute resources. In this paper, we provide a comprehensive review of $14$ standard optimization methods successfully used in deep learning research and a theoretical assessment of the difficulties in numerical optimization from the optimization literature.
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2022
- DOI:
- 10.48550/arXiv.2211.15596
- arXiv:
- arXiv:2211.15596
- Bibcode:
- 2022arXiv221115596K
- Keywords:
-
- Computer Science - Machine Learning;
- Computer Science - Computer Vision and Pattern Recognition;
- Mathematics - Optimization and Control