Think Global, Act Local: Relating DNN generalisation and node-level SNR
Abstract
The reasons behind good DNN generalisation remain an open question. In this paper we explore the problem by examining the Signal-to-Noise Ratio (SNR) of individual nodes in the network. Starting from information-theoretic principles, we derive an expression for the SNR of a DNN node's output. Using this expression, we construct figures-of-merit that quantify how well a node's weights optimise SNR (or, equivalently, information rate). Applying these figures-of-merit, we give examples indicating that weight sets which promote good SNR performance also exhibit good generalisation. In addition, we identify the qualities of weight sets that exhibit good SNR behaviour and hence promote good generalisation. This leads to a discussion of how these results relate to network training and regularisation. Finally, we identify ways in which these observations can inform training design.
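As a rough illustration of the kind of node-level figure-of-merit the abstract describes (not the paper's own information-theoretic derivation, which is not reproduced here), the sketch below treats a node as a linear combiner w·x of a fixed signal vector s in i.i.d. additive noise and scores the weights by how close their achieved SNR is to the matched-filter optimum. The noise model, the function names `node_snr` and `snr_figure_of_merit`, and the example vectors are all assumptions made purely for illustration.

```python
import numpy as np

def node_snr(w, s, sigma2=1.0):
    """Output SNR of a linear node y = w.x when the input is a fixed
    signal vector s plus i.i.d. zero-mean noise of variance sigma2.
    (Assumed noise model; the paper derives its SNR expression from
    information-theoretic principles.)"""
    return float(np.dot(w, s) ** 2 / (sigma2 * np.dot(w, w)))

def snr_figure_of_merit(w, s):
    """Achieved SNR as a fraction of the matched-filter optimum ||s||^2 / sigma2.
    Equals cos^2 of the angle between w and s; values near 1 mean the weights
    are well aligned with the signal direction."""
    return float(np.dot(w, s) ** 2 / (np.dot(w, w) * np.dot(s, s)))

# Example: weights aligned vs. misaligned with a hypothetical signal template.
s = np.array([1.0, 2.0, 0.5])
w_good = 0.8 * s                          # scaled copy of s -> figure-of-merit ~1.0
w_noisy = s + np.array([1.5, -1.0, 2.0])  # perturbed weights -> figure-of-merit < 1.0
print(snr_figure_of_merit(w_good, s))
print(snr_figure_of_merit(w_noisy, s))
```

Under this assumed model, the figure-of-merit is scale-invariant in the weights, so it measures only how well the weight direction matches the signal rather than the overall gain of the node.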
- Publication: arXiv e-prints
- Pub Date: February 2020
- DOI: 10.48550/arXiv.2002.04687
- arXiv: arXiv:2002.04687
- Bibcode: 2020arXiv200204687N
- Keywords: Computer Science - Machine Learning; Electrical Engineering and Systems Science - Signal Processing; Statistics - Machine Learning
- E-Print: 15 pages, 5 figures