Semisupervised Classifier Evaluation and Recalibration

How many labeled examples are needed to estimate a classifier's performance on a new dataset? We study the case where data is plentiful, but labels are expensive. We show that by making a few reasonable assumptions on the structure of the data, it is possible to estimate performance curves, with confidence bounds, using a small number of ground truth labels. Our approach, which we call Semisupervised Performance Evaluation (SPE), is based on a generative model for the classifier's confidence scores. In addition to estimating the performance of classifiers on new datasets, SPE can be used to recalibrate a classifier by re-estimating the class-conditional confidence distributions.
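As a rough illustration of the idea in the abstract, the sketch below fits a two-component mixture to classifier confidence scores using mostly unlabeled items plus a handful of labels, then reads precision and recall off the fitted class-conditional distributions. Several pieces are assumptions made for this example and are not taken from the paper: the Gaussian form of the class-conditional score distributions, the EM fitting procedure, the synthetic data, and all function names. It produces point estimates only and does not compute the confidence bounds the abstract mentions.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_sf(x, mu, sigma):
    """P(score > x) under a Gaussian with the given mean and std."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def fit_score_mixture(scores, labels, n_iter=100):
    """EM for a two-Gaussian mixture over scores (assumed model, see lead-in).

    labels[i] is 1, 0, or None (unlabeled). Labeled items keep a fixed
    responsibility; unlabeled items are re-estimated in each E-step.
    Returns (pi, (mu0, sigma0), (mu1, sigma1)), where pi is the positive prior.
    """
    n = len(scores)
    median = sorted(scores)[n // 2]
    # Initial guess: unlabeled items above the median count as positive.
    resp = [float(y) if y is not None else (1.0 if s > median else 0.0)
            for s, y in zip(scores, labels)]
    for _ in range(n_iter):
        # M-step: weighted prior, means, and standard deviations.
        n1 = sum(resp)
        n0 = n - n1
        pi = n1 / n
        mu1 = sum(r * s for r, s in zip(resp, scores)) / n1
        mu0 = sum((1.0 - r) * s for r, s in zip(resp, scores)) / n0
        var1 = sum(r * (s - mu1) ** 2 for r, s in zip(resp, scores)) / n1
        var0 = sum((1.0 - r) * (s - mu0) ** 2 for r, s in zip(resp, scores)) / n0
        s1 = max(math.sqrt(var1), 1e-3)
        s0 = max(math.sqrt(var0), 1e-3)
        # E-step: update responsibilities of unlabeled items only.
        for i, (s, y) in enumerate(zip(scores, labels)):
            if y is None:
                p1 = pi * normal_pdf(s, mu1, s1)
                p0 = (1.0 - pi) * normal_pdf(s, mu0, s0)
                total = p1 + p0
                resp[i] = p1 / total if total > 0.0 else 0.5
    return pi, (mu0, s0), (mu1, s1)


def estimated_precision_recall(threshold, pi, neg, pos):
    """Precision and recall at a score threshold, from the fitted model."""
    recall = normal_sf(threshold, *pos)
    tp = pi * recall
    fp = (1.0 - pi) * normal_sf(threshold, *neg)
    precision = tp / (tp + fp) if tp + fp > 0.0 else 1.0
    return precision, recall


# Synthetic scores (hypothetical data): 30% positives, only 20 labels revealed.
random.seed(0)
truth = [1 if random.random() < 0.3 else 0 for _ in range(2000)]
scores = [random.gauss(3.0 if y else 0.0, 1.0) for y in truth]
labels = [y if i < 20 else None for i, y in enumerate(truth)]

pi_hat, neg_hat, pos_hat = fit_score_mixture(scores, labels)
for t in (0.5, 1.5, 2.5):
    p, r = estimated_precision_recall(t, pi_hat, neg_hat, pos_hat)
    print(f"threshold={t:.1f}  precision={p:.3f}  recall={r:.3f}")
```

Sweeping the threshold traces out an estimated precision-recall curve. Refitting the same model on scores from a new dataset would give updated class-conditional distributions, which is one way to picture the recalibration use described above.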

Publication: arXiv e-prints
Pub Date: October 2012
DOI: 10.48550/arXiv.1210.2162
arXiv: arXiv:1210.2162
Bibcode: 2012arXiv1210.2162W
Keywords: Computer Science - Machine Learning; Computer Science - Computer Vision and Pattern Recognition
