On the Calibration of Probabilistic Classifier Sets

doi:10.48550/arXiv.2205.10082

Multi-class classification methods that produce sets of probabilistic classifiers, such as ensemble learning methods, are able to model aleatoric and epistemic uncertainty. Aleatoric uncertainty is then typically quantified via the Bayes error, and epistemic uncertainty via the size of the set. In this paper, we extend the notion of calibration, which is commonly used to evaluate the validity of the aleatoric uncertainty representation of a single probabilistic classifier, to assess the validity of an epistemic uncertainty representation obtained by sets of probabilistic classifiers. Broadly speaking, we call a set of probabilistic classifiers calibrated if one can find a calibrated convex combination of these classifiers. To evaluate this notion of calibration, we propose a novel nonparametric calibration test that generalizes an existing test for single probabilistic classifiers to the case of sets of probabilistic classifiers. Making use of this test, we empirically show that ensembles of deep neural networks are often not well calibrated.
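The idea of a "calibrated convex combination" can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the weights are constant across instances, the function names `convex_combination` and `expected_calibration_error` are hypothetical, and a binned expected calibration error on the top-label confidence stands in as a simple proxy. This is not the nonparametric test proposed in the paper.

```python
import numpy as np

def convex_combination(preds, weights):
    """Mix M classifiers' predictions with fixed simplex weights.

    preds:   array of shape (M, N, K), class probabilities per ensemble member
    weights: array of shape (M,), nonnegative and summing to 1 (assumed constant
             across instances; the paper's definition may be more general)
    """
    preds = np.asarray(preds, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must lie on the probability simplex")
    # Contract the member axis: result has shape (N, K).
    return np.tensordot(weights, preds, axes=1)

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned top-label calibration error (an illustrative proxy, not the paper's test)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    conf = probs.max(axis=1)                      # confidence of the predicted class
    correct = (probs.argmax(axis=1) == labels).astype(float)
    bins = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            # Weight each bin's |accuracy - confidence| gap by its share of samples.
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece
```

Under these assumptions, one could scan candidate weight vectors and inspect the calibration error of each mixture. A set with no well-calibrated mixture would, loosely speaking, fail the notion described above.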

Publication: arXiv e-prints
Pub Date: May 2022
DOI: 10.48550/arXiv.2205.10082
arXiv: arXiv:2205.10082
Bibcode: 2022arXiv220510082M
Keywords: Statistics - Machine Learning; Computer Science - Machine Learning
