Machine Learning for Columnar High Energy Physics Analysis
Abstract
Machine learning (ML) has become an integral component of high energy physics data analyses and is likely to continue to grow in prevalence. Physicists are incorporating ML into many aspects of analysis, from using boosted decision trees to classify particle jets to using unsupervised learning to search for physics beyond the Standard Model. Because ML methods are now so widespread in analysis and these analyses must be scaled up for HL-LHC data, neatly integrating ML training and inference into scalable analysis workflows will improve the user experience of analysis in the HL-LHC era. We present the integration of ML training and inference into the IRIS-HEP Analysis Grand Challenge (AGC) pipeline as an example of what this integration can look like in a realistic analysis environment. We also use Open Data so that the project is accessible to the broader community. Different approaches to performing ML inference at analysis facilities are investigated and compared, including inference through external servers. Since ML techniques are applied to many different types of tasks in physics analyses, we showcase options for ML integration that can be applied to various inference needs.
- Publication: arXiv e-prints
- Pub Date: January 2024
- DOI: 10.48550/arXiv.2401.01802
- arXiv: arXiv:2401.01802
- Bibcode: 2024arXiv240101802K
- Keywords: High Energy Physics - Experiment
- E-Print: Submitted as CHEP 2023 conference proceedings to EPJ (European Physical Journal)