Estimating Generalization under Distribution Shifts via Domain-Invariant Representations

When machine learning models are deployed on a test distribution different from the training distribution, they can perform poorly while overestimating their own performance. In this work, we aim to better estimate a model's performance under distribution shift, without supervision. To do so, we use a set of domain-invariant predictors as a proxy for the unknown, true target labels. Since the error of the resulting risk estimate depends on the target risk of the proxy model, we study the generalization of domain-invariant representations and show that the complexity of the latent representation has a significant influence on the target risk. Empirically, our approach (1) enables self-tuning of domain adaptation models, and (2) accurately estimates the target error of given models under distribution shift. Other applications include model selection, early-stopping decisions, and error detection.
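
The proxy-label idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the proxy predictors have already been trained, and it simply measures how often a model's predictions on unlabeled target data disagree with each proxy's predictions. The function names (`disagreement_rate`, `proxy_risk_estimates`), the toy predictions, and the choice to report both the mean and the maximum disagreement are all assumptions made for illustration.

```python
def disagreement_rate(model_preds, proxy_preds):
    """Fraction of target examples on which the model and one proxy disagree."""
    if len(model_preds) != len(proxy_preds):
        raise ValueError("prediction lists must have the same length")
    if not model_preds:
        raise ValueError("prediction lists must be non-empty")
    mismatches = sum(1 for m, p in zip(model_preds, proxy_preds) if m != p)
    return mismatches / len(model_preds)


def proxy_risk_estimates(model_preds, proxy_preds_list):
    """Summarize disagreement between a model and a set of proxy predictors.

    The proxies stand in for the unknown target labels, so no labels are used.
    Reporting the mean and the max is an illustrative choice, not the paper's rule.
    """
    rates = [disagreement_rate(model_preds, p) for p in proxy_preds_list]
    return {
        "per_proxy": rates,
        "mean": sum(rates) / len(rates),
        "max": max(rates),
    }


# Hypothetical predicted class labels on eight unlabeled target examples.
model_preds = [0, 1, 1, 0, 2, 2, 1, 0]
proxy_preds_list = [
    [0, 1, 1, 0, 2, 1, 1, 0],  # differs from the model on one example
    [0, 1, 0, 0, 2, 1, 1, 1],  # differs from the model on three examples
]

estimates = proxy_risk_estimates(model_preds, proxy_preds_list)
print(estimates)
```

As the abstract notes, the quality of such an estimate depends on the target risk of the proxy itself: a proxy that is wrong on the target distribution yields a misleading disagreement rate.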

Publication:

arXiv e-prints

Pub Date:

July 2020

DOI:

10.48550/arXiv.2007.03511

arXiv:

arXiv:2007.03511

Bibcode:

2020arXiv200703511C

Keywords:

Computer Science - Machine Learning;
Statistics - Machine Learning

E-Print:

arXiv admin note: text overlap with arXiv:1910.05804
