Theory of the Frequency Principle for General Deep Neural Networks

doi:10.48550/arXiv.1906.09235

Alongside the fruitful application of Deep Neural Networks (DNNs) to realistic problems, recent empirical studies of DNNs have reported a universal phenomenon, the Frequency Principle (F-Principle): a DNN tends to learn a target function from low to high frequencies during training. The F-Principle has been very useful in providing both qualitative and quantitative understanding of DNNs. In this paper, we rigorously investigate the F-Principle for the training dynamics of a general DNN at three stages: the initial stage, the intermediate stage, and the final stage. For each stage, a theorem is provided in terms of proper quantities characterizing the F-Principle. Our results are general in the sense that they hold for multilayer networks with general activation functions, population densities of data, and a large class of loss functions. Our work lays a theoretical foundation for the F-Principle, toward a better understanding of the training process of DNNs.
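As an informal illustration of the phenomenon the abstract describes (not the paper's construction or its theorems), the sketch below trains a small two-layer tanh network on a 1-D target with one low-frequency and one high-frequency component, and records the relative error of the network output at each of those two frequencies. Everything here is an assumption made for illustration: the target sin(x) + sin(5x), the network width, the learning rate, the function name relative_frequency_errors, and the choice of relative DFT error as the diagnostic.

```python
import numpy as np

def relative_frequency_errors(n_steps=3000, record_every=500,
                              width=200, lr=0.002, seed=0):
    """Train a toy two-layer tanh network with full-batch gradient descent
    and track the relative DFT error at a low and a high frequency.
    All settings are illustrative assumptions, not taken from the paper."""
    rng = np.random.default_rng(seed)
    n = 64
    # One full period on [-pi, pi), so sin(x) sits at DFT index 1
    # and sin(5x) at DFT index 5.
    x = np.linspace(-np.pi, np.pi, n, endpoint=False).reshape(-1, 1)
    y = np.sin(x) + np.sin(5 * x)
    y_hat = np.fft.rfft(y[:, 0])
    k_low, k_high = 1, 5

    # Hypothetical initialization for the toy network.
    W1 = rng.normal(0.0, 1.0, (1, width))
    b1 = rng.normal(0.0, 1.0, (1, width))
    W2 = rng.normal(0.0, 1.0 / np.sqrt(width), (width, 1))
    b2 = np.zeros((1, 1))

    history = []
    for step in range(n_steps + 1):
        # Forward pass.
        h = np.tanh(x @ W1 + b1)
        out = h @ W2 + b2
        resid = out - y
        loss = float(np.mean(resid ** 2))

        if step % record_every == 0:
            out_hat = np.fft.rfft(out[:, 0])
            err = np.abs(out_hat - y_hat) / np.abs(y_hat)
            history.append((step, loss, float(err[k_low]), float(err[k_high])))

        # Backward pass for the mean-squared-error loss.
        g_out = 2.0 * resid / n
        gW2 = h.T @ g_out
        gb2 = g_out.sum(axis=0, keepdims=True)
        g_h = (g_out @ W2.T) * (1.0 - h ** 2)
        gW1 = x.T @ g_h
        gb1 = g_h.sum(axis=0, keepdims=True)

        # Plain gradient-descent update.
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return history

if __name__ == "__main__":
    for step, loss, e_low, e_high in relative_frequency_errors():
        print(f"step {step:5d}  loss {loss:.4f}  "
              f"rel. err k=1: {e_low:.3f}  rel. err k=5: {e_high:.3f}")
```

Comparing the two error columns across recorded steps gives a rough view of which frequency the network fits first; the outcome depends on the assumed initialization and hyperparameters, so this is a diagnostic sketch rather than a demonstration of the paper's results.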

Publication: arXiv e-prints

Pub Date: June 2019

DOI: 10.48550/arXiv.1906.09235

arXiv: arXiv:1906.09235

Bibcode: 2019arXiv190609235L

Keywords: Computer Science - Machine Learning; Mathematics - Optimization and Control; Statistics - Machine Learning; 68Q32; 68T01; I.2.6

E-Print: under review

NASA/ADS
