Universal characteristics of deep neural network loss surfaces from random matrix theory
Abstract
This paper considers several aspects of random matrix universality in deep neural networks. Motivated by recent experimental work, we use universal properties of random matrices related to local statistics to derive practical implications for deep neural networks based on a realistic model of their Hessians. In particular, we derive universal aspects of outliers in the spectra of deep neural networks and demonstrate the important role of random matrix local laws in popular preconditioned gradient descent algorithms. We also present insights into deep neural network loss surfaces from quite general arguments based on tools from statistical physics and random matrix theory.
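The spectral outliers mentioned in the abstract can be illustrated with a minimal numerical sketch. The snippet below is not the paper's Hessian model; it is a standard toy example (a Wigner matrix plus a rank-one spike, the BBP setting) showing how a low-rank perturbation of strength above a critical threshold produces an eigenvalue detached from the bulk spectrum. The strength `theta` and the spike direction `v` are illustrative choices, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Wigner-type symmetric random matrix, normalized so the bulk
# spectrum approximately fills the semicircle support [-2, 2]
A = rng.standard_normal((n, n))
H = (A + A.T) / np.sqrt(2 * n)

# Rank-one additive spike of strength theta > 1 (a hypothetical
# stand-in for low-rank structure in a model Hessian)
theta = 3.0
v = np.ones(n) / np.sqrt(n)
H_spiked = H + theta * np.outer(v, v)

eigs = np.sort(np.linalg.eigvalsh(H_spiked))
outlier = eigs[-1]

# BBP-type prediction: for theta > 1 the top eigenvalue separates
# from the bulk and sits near theta + 1/theta
print(f"largest eigenvalue: {outlier:.3f}  (prediction: {theta + 1/theta:.3f})")
print(f"second largest:     {eigs[-2]:.3f}  (bulk edge ~ 2.0)")
```

Running this, the largest eigenvalue lands close to `theta + 1/theta ≈ 3.33`, well separated from the bulk edge near 2 — the detachment phenomenon that the paper's universality arguments address for realistic Hessian models.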
Publication: arXiv e-prints
Pub Date: May 2022
arXiv: arXiv:2205.08601
Bibcode: 2022arXiv220508601B
Keywords: Mathematical Physics; Condensed Matter - Disordered Systems and Neural Networks; Computer Science - Machine Learning
E-Print: 42 pages