Conditional Mutual Information-Based Generalization Bound for Meta Learning

doi:10.48550/arXiv.2010.10886

Meta-learning optimizes an inductive bias, typically in the form of the hyperparameters of a base-learning algorithm, by observing data from a finite number of related tasks. This paper presents an information-theoretic bound on the generalization performance of any given meta-learner, which builds on the conditional mutual information (CMI) framework of Steinke and Zakynthinou (2020). In the proposed extension to meta-learning, the CMI bound involves a training meta-supersample obtained by first sampling $2N$ independent tasks from the task environment, and then drawing $2M$ independent training samples for each sampled task. The meta-training data fed to the meta-learner is modelled as being obtained by randomly selecting $N$ of the $2N$ available tasks and, for each task, $M$ of its $2M$ available training samples. The resulting bound is explicit in two CMI terms, which measure the information that the meta-learner output and the base-learner output provide about which training data are selected, given the entire meta-supersample. Finally, we present a numerical example that illustrates the merits of the proposed bound in comparison to prior information-theoretic bounds for meta-learning.
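The sampling scheme described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's construction: it assumes the paired-selection convention of the CMI framework (the $2N$ tasks are arranged in $N$ pairs and the $2M$ samples per task in $M$ pairs, with an independent fair bit choosing one element of each pair), and the Gaussian task environment and all function names are hypothetical.

```python
import random

def sample_task(rng):
    # Hypothetical task environment: a task is the mean of a Gaussian.
    return rng.gauss(0.0, 1.0)

def sample_datum(task, rng):
    # Hypothetical per-task data distribution: unit-variance Gaussian.
    return rng.gauss(task, 1.0)

def draw_meta_supersample(N, M, rng):
    """Draw 2N tasks arranged as N pairs; each task gets 2M samples as M pairs."""
    supersample = []
    for _ in range(N):
        task_pair = []
        for _ in range(2):
            task = sample_task(rng)
            data_pairs = [(sample_datum(task, rng), sample_datum(task, rng))
                          for _ in range(M)]
            task_pair.append(data_pairs)
        supersample.append(tuple(task_pair))
    return supersample

def select_meta_training_set(supersample, rng):
    """Pick one task per task pair and one sample per sample pair (assumed scheme)."""
    meta_train, task_bits, sample_bits = [], [], []
    for task_pair in supersample:
        r = rng.randrange(2)               # selects which task of the pair is used
        chosen = task_pair[r]
        bits = [rng.randrange(2) for _ in chosen]  # selects one sample per pair
        meta_train.append([pair[b] for pair, b in zip(chosen, bits)])
        task_bits.append(r)
        sample_bits.append(bits)
    return meta_train, task_bits, sample_bits

rng = random.Random(0)
supersample = draw_meta_supersample(N=3, M=4, rng=rng)
meta_train, task_bits, sample_bits = select_meta_training_set(supersample, rng)
```

Here `meta_train` holds $N$ tasks with $M$ samples each, while `task_bits` and `sample_bits` record which data were selected. The two CMI terms in the bound measure how much the meta-learner and base-learner outputs reveal about selection variables of this kind, given the full `supersample`.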

Publication:

arXiv e-prints

Pub Date:

October 2020

DOI:

10.48550/arXiv.2010.10886

arXiv:

arXiv:2010.10886

Bibcode:

2020arXiv201010886R

Keywords:

Computer Science - Machine Learning;
Computer Science - Information Theory;
Statistics - Machine Learning

E-Print:

Submitted for conference publication
