A Robust Initialization of Residual Blocks for Effective ResNet Training without Batch Normalization
Abstract
Batch Normalization is an essential component of all state-of-the-art neural network architectures. However, since it introduces many practical issues, much recent research has been devoted to designing normalization-free architectures. In this paper, we show that weight initialization is key to training ResNet-like normalization-free networks. In particular, we propose a slight modification to the operation that sums a block's output with the skip-connection branch, so that the whole network is correctly initialized. We show that this modified architecture achieves competitive results on CIFAR-10, CIFAR-100 and ImageNet without further regularization or algorithmic modifications.
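The abstract does not spell out the modified summation, so the following is only an illustrative sketch under stated assumptions, not necessarily the paper's formulation. It assumes that at initialization the residual branch output is independent of the skip branch and has the same variance, and it combines them as `c * (skip + branch)` with `c = 1/sqrt(2)`. The function names and the Gaussian stand-in for a residual branch are hypothetical. The simulation compares how the activation variance evolves over stacked blocks with and without the scaling.

```python
import math
import random


def scaled_residual_sum(skip, branch, c=1.0 / math.sqrt(2.0)):
    """Combine skip and residual-branch outputs as c * (skip + branch).

    Assumption: c = 1/sqrt(2) is chosen so that two independent,
    equal-variance inputs yield an output with the same variance.
    """
    return [c * (s + b) for s, b in zip(skip, branch)]


def sample_variance(values):
    """Population variance of a list of floats."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def variance_after_blocks(num_blocks, c, n=20000, seed=0):
    """Simulate activation variance after stacking residual blocks.

    Each block's residual branch is replaced by a hypothetical stand-in:
    fresh zero-mean Gaussian noise whose standard deviation matches the
    current activations (a crude model of a well-initialized branch).
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(num_blocks):
        std = math.sqrt(sample_variance(x))
        branch = [rng.gauss(0.0, std) for _ in range(n)]
        x = scaled_residual_sum(x, branch, c=c)
    return sample_variance(x)


if __name__ == "__main__":
    scaled = variance_after_blocks(10, c=1.0 / math.sqrt(2.0))
    plain = variance_after_blocks(10, c=1.0)
    print(f"variance after 10 blocks, scaled sum: {scaled:.3f}")
    print(f"variance after 10 blocks, plain sum:  {plain:.3f}")
```

Under these assumptions the plain sum roughly doubles the variance at every block, while the scaled sum keeps it close to its initial value. That stability is the kind of property a correct initialization is meant to provide in the absence of normalization layers.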
- Publication: arXiv e-prints
- Pub Date: December 2021
- DOI: 10.48550/arXiv.2112.12299
- arXiv: arXiv:2112.12299
- Bibcode: 2021arXiv211212299C
- Keywords: Computer Science - Machine Learning
- E-Print: 16 pages (4 pages of supplementary material), 9 figures, 2 tables