Large deviation analysis of function sensitivity in random deep neural networks
Abstract
Mean field theory has been successfully used to analyze deep neural networks (DNNs) in the infinite-size limit. Given the finite size of realistic DNNs, we use large deviation theory and path integral analysis to study the deviation of functions represented by DNNs from their typical mean field solutions. The parameter perturbations investigated include weight sparsification (dilution) and binarization, both commonly used in model simplification, for ReLU and sign activation functions. We find that random networks with ReLU activation are more robust to parameter perturbations than their counterparts with sign activation, which is arguably reflected in the simplicity of the functions they generate.
 Publication:

Journal of Physics A: Mathematical and Theoretical
 Pub Date:
 March 2020
 DOI:
 10.1088/1751-8121/ab6a6f
 arXiv:
 arXiv:1910.05769
 Bibcode:
 2020JPhA...53j4002L
 Keywords:

 large deviation theory;
 path integral;
 deep neural networks;
 function sensitivity;
 Condensed Matter - Disordered Systems and Neural Networks;
 Computer Science - Machine Learning;
 Statistics - Machine Learning
 EPrint:
 J. Phys. A: Math. Theor. 53, 104002 (2020)