Equi-normalization of Neural Networks
Abstract
Modern neural networks are over-parametrized. In particular, each rectified linear hidden unit can be modified by a multiplicative factor by adjusting input and output weights, without changing the rest of the network. Inspired by the Sinkhorn-Knopp algorithm, we introduce a fast iterative method for minimizing the L2 norm of the weights, equivalently the weight decay regularizer. It provably converges to a unique solution. Interleaving our algorithm with SGD during training improves the test accuracy. For small batches, our approach offers an alternative to batch-and group-normalization on CIFAR-10 and ImageNet with a ResNet-18.
- Publication:
-
arXiv e-prints
- Pub Date:
- February 2019
- DOI:
- 10.48550/arXiv.1902.10416
- arXiv:
- arXiv:1902.10416
- Bibcode:
- 2019arXiv190210416S
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Machine Learning
- E-Print:
- ICLR 2019 camera-ready