Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks
Abstract
We provide several new results on the sample complexity of vector-valued linear predictors (parameterized by a matrix), and, more generally, of neural networks. Focusing on size-independent bounds, where only the Frobenius norm distance of the parameters from some fixed reference matrix $W_0$ is controlled, we show that the sample complexity behavior can be surprisingly different from what one might expect given the well-studied setting of scalar-valued linear predictors. This also leads to new sample complexity bounds for feed-forward neural networks, tackling some open questions in the literature, and establishing a new convex linear prediction problem that is provably learnable without uniform convergence.
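For concreteness, here is a sketch of the norm-bounded predictor class the abstract alludes to; the symbols $B$, $b$, $d$, and $k$ are illustrative notation not taken from the abstract, and the exact class, loss, and assumptions are as defined in the paper:

$$
\mathcal{H}_B \;=\; \bigl\{\, x \mapsto W x \;:\; W \in \mathbb{R}^{k \times d},\ \|W - W_0\|_F \le B \,\bigr\}, \qquad \|x\| \le b .
$$

Under this reading, a "size-independent" sample complexity bound is one that may depend on the norm parameters $B$ and $b$, but not on the dimensions $d$ and $k$ (nor, in the neural network case, on the network's width or depth).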
- Publication: arXiv e-prints
- Pub Date: May 2023
- arXiv: arXiv:2305.16475
- Bibcode: 2023arXiv230516475M
- Keywords: Computer Science - Machine Learning; Statistics - Machine Learning
- E-Print: 30 pages