Designing a Linearized Potential Function in Neural Network Optimization Using Csiszár Type of Tsallis Entropy
Abstract
In recent years, learning for neural networks can be viewed as optimization in the space of probability measures. To obtain the exponential convergence to the optimizer, the regularizing term based on Shannon entropy plays an important role. Even though an entropy function heavily affects convergence results, there is almost no result on its generalization, because of the following two technical difficulties: one is the lack of sufficient condition for generalized logarithmic Sobolev inequality, and the other is the distributional dependence of the potential function within the gradient flow equation. In this paper, we establish a framework that utilizes a linearized potential function via Csiszár type of Tsallis entropy, which is one of the generalized entropies. We also show that our new framework enable us to derive an exponential convergence result.
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2024
- DOI:
- arXiv:
- arXiv:2411.03611
- Bibcode:
- 2024arXiv241103611A
- Keywords:
-
- Statistics - Machine Learning;
- Computer Science - Machine Learning;
- Mathematics - Analysis of PDEs;
- 35Q49;
- 49J20;
- 82C32