A Closer Look at Double Backpropagation
Abstract
In recent years, an increasing number of neural network models have included derivatives with respect to inputs in their loss functions, resulting in so-called double backpropagation for first-order optimization. However, so far no general description of the involved derivatives exists. Here, we cover a wide array of special cases in a very general Hilbert space framework, which allows us to provide optimized backpropagation rules for many real-world scenarios. This includes the reduction of calculations for Frobenius-norm-penalties on Jacobians by roughly a third for locally linear activation functions. Furthermore, we provide a description of the discontinuous loss surface of ReLU networks both in the inputs and the parameters and demonstrate why the discontinuities do not pose a big problem in reality.
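To make the term concrete, here is a minimal JAX sketch of double backpropagation. It is not the paper's optimized rule set. It differentiates a network with respect to its input to form a squared-Frobenius-norm Jacobian penalty, then differentiates the resulting loss with respect to the parameters. The network `mlp`, the layer sizes, the weight `lam`, and the random data are all assumptions made for illustration.

```python
import jax
import jax.numpy as jnp


def mlp(params, x):
    """Tiny one-hidden-layer ReLU network (hypothetical, for illustration only)."""
    W1, b1, W2, b2 = params
    h = jax.nn.relu(W1 @ x + b1)
    return W2 @ h + b2


def penalized_loss(params, x, y, lam=0.1):
    """Squared error plus a squared-Frobenius-norm penalty on the input Jacobian."""
    pred = mlp(params, x)
    # First differentiation: Jacobian of the output with respect to the input x.
    jac = jax.jacrev(mlp, argnums=1)(params, x)
    return jnp.sum((pred - y) ** 2) + lam * jnp.sum(jac ** 2)


# Second differentiation: gradient of the whole loss (Jacobian term included)
# with respect to the parameters. This nesting is the double backpropagation.
grad_fn = jax.grad(penalized_loss, argnums=0)

# Assumed toy shapes: 3 inputs, 4 hidden units, 2 outputs.
key = jax.random.PRNGKey(0)
k1, k2, k3, k4 = jax.random.split(key, 4)
params = (
    jax.random.normal(k1, (4, 3)),  # W1
    jnp.zeros(4),                   # b1
    jax.random.normal(k2, (2, 4)),  # W2
    jnp.zeros(2),                   # b2
)
x = jax.random.normal(k3, (3,))
y = jax.random.normal(k4, (2,))

grads = grad_fn(params, x, y)  # tuple with the same structure as params
```

In this sketch the output bias `b2` does not appear in the input Jacobian, so its gradient comes from the data term alone. With a ReLU hidden layer the Jacobian is piecewise constant in `x`, which is the locally linear setting the abstract refers to.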
- Publication: arXiv e-prints
- Pub Date: June 2019
- DOI: 10.48550/arXiv.1906.06637
- arXiv: arXiv:1906.06637
- Bibcode: 2019arXiv190606637E
- Keywords: Computer Science - Machine Learning; Mathematics - Optimization and Control; Statistics - Machine Learning
- E-Print: 16 pages, 7 figures