Collaborative Learning for Deep Neural Networks
Abstract
We introduce collaborative learning in which multiple classifier heads of the same network are simultaneously trained on the same training data to improve generalization and robustness to label noise with no extra inference cost. It acquires the strengths from auxiliary training, multi-task learning and knowledge distillation. There are two important mechanisms involved in collaborative learning. First, the consensus of multiple views from different classifier heads on the same example provides supplementary information as well as regularization to each classifier, thereby improving generalization. Second, intermediate-level representation (ILR) sharing with backpropagation rescaling aggregates the gradient flows from all heads, which not only reduces training computational complexity, but also facilitates supervision to the shared layers. The empirical results on CIFAR and ImageNet datasets demonstrate that deep neural networks learned as a group in a collaborative way significantly reduce the generalization error and increase the robustness to label noise.
- Publication:
-
arXiv e-prints
- Pub Date:
- May 2018
- arXiv:
- arXiv:1805.11761
- Bibcode:
- 2018arXiv180511761S
- Keywords:
-
- Statistics - Machine Learning;
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Machine Learning
- E-Print:
- To appear in NIPS 2018