An investigation of over-training within semi-supervised machine learning models in the search for heavy resonances at the LHC

doi:10.48550/arXiv.2109.07287

In particle physics, semi-supervised machine learning is an attractive option for reducing model dependence in searches beyond the Standard Model. When semi-supervised techniques are used to train machine learning models in the search for bosons at the Large Hadron Collider, over-training of the model must be investigated. Internal fluctuations of the phase space and bias in training can cause semi-supervised models to over-fit and label false signals within the phase space. The issue of false signal generation in semi-supervised models has not been fully analyzed, so the probability of such situations occurring must be quantified using a toy Monte Carlo model. This investigation of $Z\gamma$ resonances is performed using a pure background Monte Carlo sample. Unique pure background samples are extracted to mimic ATLAS data in a background-plus-signal region, and multiple runs over these samples allow the probability of fake signals arising from over-training to be investigated thoroughly.
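
The idea of estimating a fake-signal probability from repeated background-only runs can be illustrated with a minimal sketch. This is not the authors' procedure: the semi-supervised classifier is replaced here by a simple per-bin excess count, and the flat background shape, bin count, sample size, and threshold are all hypothetical choices made only for illustration.

```python
import math
import random


def toy_max_significance(rng, n_events=2000, n_bins=20, lo=0.0, hi=1.0):
    """Generate one background-only toy; return its largest per-bin excess in sigma."""
    # Assumption: a flat background in an arbitrary scaled mass variable.
    counts = [0] * n_bins
    for _ in range(n_events):
        x = rng.uniform(lo, hi)
        # Clamp the index so a value equal to `hi` falls in the last bin.
        idx = min(int((x - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    expected = n_events / n_bins
    # Gaussian approximation to the Poisson fluctuation in each bin.
    return max((c - expected) / math.sqrt(expected) for c in counts)


def fake_signal_rate(n_toys=500, threshold=2.0, seed=1):
    """Fraction of background-only toys whose largest excess exceeds `threshold`."""
    rng = random.Random(seed)
    n_fake = sum(toy_max_significance(rng) > threshold for _ in range(n_toys))
    return n_fake / n_toys


if __name__ == "__main__":
    print(f"Estimated fake-signal rate: {fake_signal_rate():.3f}")
```

Each toy contains no signal by construction, so any toy that crosses the threshold counts as a fake signal caused purely by statistical fluctuation. In a study of the kind described above, the per-bin excess would be replaced by the response of the trained model on each pure background sample.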

Publication:

arXiv e-prints

Pub Date:

September 2021

DOI:

10.48550/arXiv.2109.07287

arXiv:

arXiv:2109.07287

Bibcode:

2021arXiv210907287L

Keywords:

High Energy Physics - Experiment;
High Energy Physics - Phenomenology

E-Print:

6 pages, 3 figures, proceedings submitted to SAIP2021
