Why do classifier accuracies show linear trends under distribution shift?
Abstract
Recent studies of generalization in deep learning have observed a puzzling trend: accuracies of models on one data distribution are approximately linear functions of the accuracies on another distribution. We explain this trend under an intuitive assumption on model similarity, which was verified empirically in prior work. More precisely, we assume the probability that two models agree in their predictions is higher than what we can infer from their accuracy levels alone. Then, we show that a linear trend must occur when evaluating models on two distributions unless the size of the distribution shift is large. This work emphasizes the value of understanding model similarity, which can have an impact on the generalization and robustness of classification models.
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2020
- DOI:
- 10.48550/arXiv.2012.15483
- arXiv:
- arXiv:2012.15483
- Bibcode:
- 2020arXiv201215483M
- Keywords:
-
- Computer Science - Machine Learning;
- Statistics - Machine Learning
- E-Print:
- 18 pages, 13 figures