Large-scale probabilistic predictors with and without guarantees of validity
Abstract
This paper studies theoretically and empirically a method of turning machine-learning algorithms into probabilistic predictors that automatically enjoys a property of validity (perfect calibration) and is computationally efficient. The price to pay for perfect calibration is that these probabilistic predictors produce imprecise (in practice, almost precise for large data sets) probabilities. When these imprecise probabilities are merged into precise probabilities, the resulting predictors, while losing the theoretical property of perfect calibration, are consistently more accurate than the existing methods in empirical studies.
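The merging step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state a merging rule, so everything here is an assumption: the interval endpoints are called `p0` and `p1` (the lower and upper probability of the positive class), the function name `merge_interval` is hypothetical, and the rule used is the one that equalises the worst-case log-loss regret between the two endpoints, namely p = p1 / (1 - p0 + p1). It is not claimed to be the exact rule used in the paper.

```python
def merge_interval(p0: float, p1: float) -> float:
    """Merge an imprecise probability [p0, p1] into one precise probability.

    Assumed rule (not taken from the abstract): choose p so that the
    log-loss regret is the same whether the true label is 0 or 1, i.e.
    p1 / p == (1 - p0) / (1 - p), which gives p = p1 / (1 - p0 + p1).
    """
    if not (0.0 <= p0 <= p1 <= 1.0):
        raise ValueError("expected 0 <= p0 <= p1 <= 1")
    # The denominator is at least 1 because p1 >= p0, so it is never zero.
    return p1 / (1.0 - p0 + p1)


if __name__ == "__main__":
    # A narrow interval, as expected for large data sets, merges to a
    # value that lies between the two endpoints.
    print(merge_interval(0.70, 0.72))
```

When the interval collapses to a single point (`p0 == p1`), the merged value equals that point, which matches the abstract's remark that the probabilities become almost precise for large data sets.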
- Publication: arXiv e-prints
- Pub Date: November 2015
- DOI: 10.48550/arXiv.1511.00213
- arXiv: arXiv:1511.00213
- Bibcode: 2015arXiv151100213V
- Keywords: Computer Science - Machine Learning; 68T05
- E-Print: 38 pages, 14 figures, to appear in Advances in Neural Information Processing Systems 28 (NIPS 2015). As compared with the previous version (v1), the MATLAB code (the 5 files with extension .m) and results of new empirical studies have been added.