Towards Sample-efficient Overparameterized Meta-learning
Abstract
An overarching goal in machine learning is to build a generalizable model with few samples. To this end, overparameterization has been the subject of immense interest as a way to explain the generalization ability of deep nets even when the size of the dataset is smaller than that of the model. While the prior literature focuses on the classical supervised setting, this paper aims to demystify overparameterization for meta-learning. Here we have a sequence of linear-regression tasks, and we ask: (1) Given earlier tasks, what is the optimal linear representation of features for a new downstream task? and (2) How many samples do we need to build this representation? This work shows that, surprisingly, overparameterization arises as a natural answer to these fundamental meta-learning questions. Specifically, for (1), we first show that learning the optimal representation coincides with the problem of designing a task-aware regularization to promote inductive bias. We leverage this inductive bias to explain how the downstream task actually benefits from overparameterization, in contrast to prior works on few-shot learning. For (2), we develop a theory to explain how feature covariance can implicitly help reduce the sample complexity well below the degrees of freedom and lead to a small estimation error. We then integrate these findings to obtain an overall performance guarantee for our meta-learning algorithm. Numerical experiments on real and synthetic data verify our insights on overparameterized meta-learning.
 Publication:

arXiv e-prints
 Pub Date:
 January 2022
 DOI:
 10.48550/arXiv.2201.06142
 arXiv:
 arXiv:2201.06142
 Bibcode:
 2022arXiv220106142S
 Keywords:

 Computer Science - Machine Learning;
 Statistics - Machine Learning
 E-Print:
 Advances in Neural Information Processing Systems, 34 (2021)