Global Convergence and Generalization Bound of Gradient-Based Meta-Learning with Deep Neural Nets
Abstract
Gradient-based meta-learning (GBML) with deep neural nets (DNNs) has become a popular approach for few-shot learning. However, due to the non-convexity of DNNs and the bi-level optimization in GBML, the theoretical properties of GBML with DNNs remain largely unknown. In this paper, we first aim to answer the following question: Does GBML with DNNs have global convergence guarantees? We provide a positive answer to this question by proving that GBML with over-parameterized DNNs is guaranteed to converge to global optima at a linear rate. The second question we aim to address is: How does GBML achieve fast adaptation to new tasks with prior experience on past tasks? To answer it, we theoretically show that GBML is equivalent to a functional gradient descent operation that explicitly propagates experience from past tasks to new ones, and then we prove a generalization error bound of GBML with over-parameterized DNNs.
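For readers unfamiliar with the bi-level optimization the paper analyzes, below is a minimal MAML-style GBML sketch in JAX. The toy linear model, task format, and learning rates are illustrative assumptions for exposition only; they are not the paper's setup (which studies over-parameterized DNNs) nor the authors' implementation, whose code is linked under E-Print below.

```python
import jax
import jax.numpy as jnp

# Minimal MAML-style GBML sketch (hypothetical toy setup).
# Inner level: adapt meta-parameters to a task's support set with one gradient step.
# Outer level: update meta-parameters on the query-set loss of the adapted model,
# differentiating through the inner step (the bi-level structure).

def predict(params, x):
    # Toy linear model; the paper instead analyzes over-parameterized DNNs.
    return x @ params

def loss(params, x, y):
    return jnp.mean((predict(params, x) - y) ** 2)

def inner_adapt(params, support, inner_lr=0.1):
    # One-step task adaptation (inner level).
    x_s, y_s = support
    grads = jax.grad(loss)(params, x_s, y_s)
    return params - inner_lr * grads

def meta_loss(params, task):
    # Outer objective: query loss after inner adaptation.
    support, (x_q, y_q) = task
    adapted = inner_adapt(params, support)
    return loss(adapted, x_q, y_q)

def meta_step(params, tasks, meta_lr=0.01):
    # Average the meta-gradient over a batch of tasks, then update meta-parameters.
    batch_loss = lambda p: jnp.mean(jnp.array([meta_loss(p, t) for t in tasks]))
    grads = jax.grad(batch_loss)(params)
    return params - meta_lr * grads
```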
- Publication:
- arXiv e-prints
- Pub Date:
- June 2020
- DOI:
- 10.48550/arXiv.2006.14606
- arXiv:
- arXiv:2006.14606
- Bibcode:
- 2020arXiv200614606W
- Keywords:
- Computer Science - Machine Learning;
- Statistics - Machine Learning
- E-Print:
- Under review. Code available at https://github.com/AI-secure/Meta-Neural-Kernel