ISAAC Newton: Input-based Approximate Curvature for Newton's Method
Abstract
We present ISAAC (Input-baSed ApproximAte Curvature), a novel method that conditions the gradient using selected second-order information and has an asymptotically vanishing computational overhead, assuming a batch size smaller than the number of neurons. We show that it is possible to compute a good conditioner based only on the input to the respective layer, without substantial computational overhead. The proposed method allows effective training even in small-batch stochastic regimes, which makes it competitive with both first-order and second-order methods.
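To illustrate the core idea of an input-based conditioner, the following is a minimal NumPy sketch. It preconditions a layer's weight gradient with the inverse of a damped input Gram matrix and uses the Woodbury identity so that only a (batch × batch) system is solved, which is cheap when the batch size is below the number of neurons. All names, the damping parameter, and the exact form of the conditioner are illustrative assumptions; see the paper and the linked code for the actual formulation.

```python
import numpy as np

def input_conditioned_grad(grad_w, x, lam=1.0):
    """Precondition a weight gradient using only the layer's input.

    grad_w: (out, in) gradient of the loss w.r.t. the weight matrix.
    x:      (batch, in) inputs to the layer for this mini-batch.
    lam:    Tikhonov damping (illustrative default).

    Applies the conditioner (X^T X / b + lam I)^{-1} on the input side.
    Via the Woodbury identity,
        (lam I + X^T X / b)^{-1}
          = (1/lam) * (I - X^T (b*lam I + X X^T)^{-1} X),
    so we invert a (batch, batch) matrix instead of an (in, in) one.
    """
    b, n_in = x.shape
    small = b * lam * np.eye(b) + x @ x.T          # (batch, batch)
    inner = np.linalg.solve(small, x @ grad_w.T)   # (batch, out)
    return (grad_w - (x.T @ inner).T) / lam
```

Because the solve involves only a (batch × batch) matrix, the per-step overhead vanishes relative to the forward/backward cost as the layer width grows, matching the small-batch assumption stated in the abstract.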
- Publication: arXiv e-prints
- Pub Date: April 2023
- DOI: 10.48550/arXiv.2305.00604
- arXiv: arXiv:2305.00604
- Bibcode: 2023arXiv230500604P
- Keywords: Computer Science - Machine Learning; Computer Science - Computer Vision and Pattern Recognition; Mathematics - Optimization and Control; Statistics - Machine Learning
- E-Print: Published at ICLR 2023, Code @ https://github.com/Felix-Petersen/isaac, Video @ https://youtu.be/7RKRX-MdwqM