Weakly supervised classification in high energy physics
Abstract
As machine learning algorithms become increasingly sophisticated to exploit subtle features of the data, they often become more dependent on simulations. This paper presents a new approach called weakly supervised classification in which class proportions are the only input into the machine learning algorithm. Using one of the most challenging binary classification tasks in high energy physics — quark versus gluon tagging — we show that weakly supervised classification can match the performance of fully supervised algorithms. Furthermore, by design, the new algorithm is insensitive to any mis-modeling of discriminating features in the data by the simulation. Weakly supervised classification is a general procedure that can be applied to a wide variety of learning problems to boost performance and robustness when detailed simulations are not reliable or not available.
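To make the idea of training on class proportions alone concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a toy one-dimensional feature drawn from Gaussians, a logistic model, and a squared-difference loss between each batch's mean prediction and its known signal fraction. The names `make_batch`, `train_weak`, and `auc` are hypothetical helpers introduced only for this illustration. The per-event labels are generated so the result can be evaluated, but they are never passed to the training function.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_batch(rng, n, signal_fraction):
    """Toy batch (assumption): signal ~ N(+1, 1), background ~ N(-1, 1)."""
    labels = (rng.random(n) < signal_fraction).astype(float)
    x = rng.normal(loc=2.0 * labels - 1.0, scale=1.0)
    return x, labels

def train_weak(batches, fractions, lr=0.5, steps=500):
    """Gradient descent on sum over batches of (mean prediction - fraction)^2.

    Only the feature values and the per-batch fractions are used;
    no per-event labels enter the loss.
    """
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for x, f in zip(batches, fractions):
            p = sigmoid(w * x + b)
            residual = p.mean() - f      # mismatch with the known proportion
            dp = p * (1.0 - p)           # derivative of the sigmoid
            grad_w += 2.0 * residual * (dp * x).mean()
            grad_b += 2.0 * residual * dp.mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def auc(scores, labels):
    """Fraction of (signal, background) pairs where signal scores higher."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    return (pos[:, None] > neg[None, :]).mean()

rng = np.random.default_rng(0)
x_a, _ = make_batch(rng, 1000, 0.8)   # batch known to be about 80% signal
x_b, _ = make_batch(rng, 1000, 0.2)   # batch known to be about 20% signal
w, b = train_weak([x_a, x_b], [0.8, 0.2])

# Evaluate on a labelled test batch; labels are used for evaluation only.
x_test, y_test = make_batch(rng, 1000, 0.5)
print("learned weight:", w, "test AUC:", auc(w * x_test + b, y_test))
```

In this toy setup the two batches differ only in their signal fractions, and that difference is what lets the model learn which direction in feature space is signal-like. The model's output here should be read as a ranking score rather than a calibrated probability.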
- Publication: Journal of High Energy Physics
- Pub Date: May 2017
- DOI: 10.1007/JHEP05(2017)145
- arXiv: arXiv:1702.00414
- Bibcode: 2017JHEP...05..145D
- Keywords: Jets; High Energy Physics - Phenomenology; Physics - Data Analysis; Statistics and Probability; Statistics - Machine Learning
- E-Print: 8 pages, 4 figures