First-order and second-order variants of the gradient descent in a unified framework
Abstract
In this paper, we provide an overview of first-order and second-order variants of the gradient descent method that are commonly used in machine learning. We propose a general framework in which 6 of these variants can be interpreted as different instances of the same approach. They are the vanilla gradient descent, the classical and generalized Gauss-Newton methods, the natural gradient descent method, the gradient covariance matrix approach, and Newton's method. Besides interpreting these methods within a single framework, we explain their specificities and show under which conditions some of them coincide.
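The abstract does not spell out the update rule shared by the six methods. As a hedged illustration only, the sketch below assumes the common preconditioned form θ ← θ − α M⁻¹ ∇L(θ), in which taking M as the identity gives vanilla gradient descent and taking M as the Hessian of the loss gives Newton's method. The other variants would correspond to other choices of M, such as a Gauss-Newton matrix, the Fisher information matrix, or a gradient covariance matrix. The function names, the toy quadratic loss, and the step sizes are assumptions made for this example and are not taken from the paper.

```python
def solve_2x2(m, v):
    """Solve m x = v for a 2x2 matrix m given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("metric matrix must be invertible")
    return [(d * v[0] - b * v[1]) / det, (a * v[1] - c * v[0]) / det]


def preconditioned_step(theta, grad, metric, lr):
    """Return theta - lr * metric^{-1} grad (assumed generic update form)."""
    direction = solve_2x2(metric, grad)
    return [t - lr * d for t, d in zip(theta, direction)]


# Toy quadratic loss L(theta) = 0.5 * theta^T A theta - b^T theta.
# A and b are arbitrary illustrative values; the Hessian of this loss is A.
A = [[3.0, 1.0], [1.0, 2.0]]
b = [1.0, 1.0]


def grad_loss(theta):
    """Gradient of the toy quadratic loss: A theta - b."""
    return [
        A[0][0] * theta[0] + A[0][1] * theta[1] - b[0],
        A[1][0] * theta[0] + A[1][1] * theta[1] - b[1],
    ]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
theta0 = [0.0, 0.0]

# Vanilla gradient descent: the metric is the identity matrix.
gd_step = preconditioned_step(theta0, grad_loss(theta0), IDENTITY, lr=0.1)

# Newton's method: the metric is the Hessian (here, the constant matrix A).
newton_step = preconditioned_step(theta0, grad_loss(theta0), A, lr=1.0)
```

On a quadratic loss, a single Newton step with step size 1 lands on the minimizer, so the gradient at `newton_step` vanishes up to rounding, whereas the vanilla step only moves a short distance along the negative gradient.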
- Publication: arXiv e-prints
- Pub Date: October 2018
- DOI: 10.48550/arXiv.1810.08102
- arXiv: arXiv:1810.08102
- Bibcode: 2018arXiv181008102P
- Keywords: Computer Science - Machine Learning; Statistics - Machine Learning
- E-Print: 13 pages