On Calibration of Ensemble-Based Credal Predictors
Abstract
In recent years, several classification methods that aim to quantify epistemic uncertainty have been proposed, producing predictions either in the form of second-order distributions or as sets of probability distributions. In this work, we focus on the latter, also called credal predictors, and address the question of how to evaluate them: What does it mean for a credal predictor to represent epistemic uncertainty in a faithful manner? To answer this question, we refer to the notion of calibration of probabilistic predictors and extend it to credal predictors. Broadly speaking, we call a credal predictor calibrated if it returns sets that cover the true conditional probability distribution. To verify this property for the important case of ensemble-based credal predictors, we propose a novel nonparametric calibration test that generalizes an existing test for probabilistic predictors to the case of credal predictors. Making use of this test, we empirically show that credal predictors based on deep neural networks are often not well calibrated.
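The coverage notion sketched in the abstract can be illustrated with a toy example. The snippet below is not the paper's nonparametric test; it is a minimal sketch, assuming the credal set is spanned by an ensemble's predicted class-probability vectors, and it checks only a simple necessary condition for coverage: that the true conditional distribution lies within the coordinate-wise lower/upper envelope of the ensemble members. The function names `credal_envelope` and `covers` are illustrative, not from the paper.

```python
import numpy as np

def credal_envelope(ensemble_probs):
    """Coordinate-wise lower/upper envelope of the ensemble members'
    predicted class probabilities: a simple outer approximation of
    the credal set spanned by the ensemble."""
    lower = ensemble_probs.min(axis=0)
    upper = ensemble_probs.max(axis=0)
    return lower, upper

def covers(ensemble_probs, true_cond_dist):
    """Necessary condition for credal coverage: the true conditional
    distribution lies inside the coordinate-wise envelope."""
    lower, upper = credal_envelope(ensemble_probs)
    return bool(np.all((true_cond_dist >= lower) & (true_cond_dist <= upper)))

# Toy example: an ensemble of 3 members over 3 classes.
ens = np.array([[0.6, 0.3, 0.1],
                [0.5, 0.4, 0.1],
                [0.7, 0.2, 0.1]])

print(covers(ens, np.array([0.55, 0.35, 0.10])))  # True: inside the envelope
print(covers(ens, np.array([0.20, 0.70, 0.10])))  # False: outside the envelope
```

A calibrated credal predictor, in the sense described above, would satisfy such a coverage property for the (unknown) true conditional distributions; the paper's contribution is a statistical test of this property that does not require knowing those distributions.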
Publication: arXiv e-prints
Pub Date: May 2022
arXiv: arXiv:2205.10082
Bibcode: 2022arXiv220510082M
Keywords: Statistics - Machine Learning; Computer Science - Machine Learning