Set-valued classification -- overview via a unified framework

doi:10.48550/arXiv.2102.12318

The multi-class classification problem is among the most popular and well-studied statistical frameworks. Modern multi-class datasets can be extremely ambiguous, and single-output predictions fail to deliver satisfactory performance. By allowing predictors to predict a set of label candidates, set-valued classification offers a natural way to deal with this ambiguity. Several formulations of set-valued classification are available in the literature, and each of them leads to different prediction strategies. The present survey aims to review popular formulations using a unified statistical framework. The proposed framework encompasses previously considered formulations, leads to new ones, and clarifies the trade-offs underlying each formulation. We provide infinite-sample optimal set-valued classification strategies and review a general plug-in principle for constructing data-driven algorithms. The exposition is supported by examples and pointers to both theoretical and practical contributions. Finally, we provide experiments on real-world datasets that compare these approaches in practice and yield general practical guidelines.

Publication: arXiv e-prints

Pub Date: February 2021

DOI: 10.48550/arXiv.2102.12318

arXiv: arXiv:2102.12318

Bibcode: 2021arXiv210212318C

Keywords: Statistics - Machine Learning; Computer Science - Machine Learning
