Rare Event Classification with Weighted Logistic Regression for Identifying Repeating Fast Radio Bursts
Abstract
An important task in the study of fast radio bursts (FRBs) remains the automatic classification of repeating and non-repeating sources based on their morphological properties. We propose a statistical model that uses a modified logistic regression to classify FRB sources. The classical logistic regression model is modified to accommodate the small proportion of repeaters in the data, a feature that is likely due to the sampling procedure and duration rather than a characteristic of the population of FRB sources. The weighted logistic regression hinges on the choice of a tuning parameter $\tau$ that represents the true proportion of repeating FRB sources in the entire population. The proposed method has a sound statistical foundation, is directly interpretable, and operates with only 5 parameters, enabling quicker retraining as data are added. Using the CHIME/FRB Collaboration sample of repeating and non-repeating FRBs and numerical experiments, we achieve a classification accuracy for repeaters of approximately 75\% or higher when $\tau$ is set between $50$\% and $60$\%. This tentatively implies a high proportion of repeaters, which is surprising but agrees with recent estimates of $\tau$ obtained using other methods.
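To make the role of $\tau$ concrete, the following is a minimal sketch of a weighted logistic regression for rare events. The abstract does not give the weighting formula, so the sketch assumes the common prior-correction weights $w_1 = \tau/\bar{y}$ for repeaters and $w_0 = (1-\tau)/(1-\bar{y})$ for non-repeaters, where $\bar{y}$ is the repeater fraction in the sample. The function names, the synthetic data, and the choice of four features plus an intercept (giving five coefficients) are illustrative assumptions, not the paper's implementation or the CHIME/FRB data.

```python
import numpy as np


def class_weights(y, tau):
    """Per-sample weights that rescale the sample repeater fraction to tau.

    Assumed form (not stated in the abstract): w1 = tau / ybar for repeaters,
    w0 = (1 - tau) / (1 - ybar) for non-repeaters.
    """
    y = np.asarray(y, dtype=float)
    ybar = y.mean()
    return np.where(y == 1, tau / ybar, (1.0 - tau) / (1.0 - ybar))


def sigmoid(z):
    # Clip the linear predictor to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def fit_weighted_logistic(X, y, tau, n_iter=50, tol=1e-10, ridge=1e-8):
    """Maximize the weighted log-likelihood with Newton steps."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])  # intercept + features
    w = class_weights(y, tau)
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = sigmoid(A @ beta)
        grad = A.T @ (w * (y - p))
        # Weighted Hessian; the tiny ridge term only stabilizes the solve.
        hess = (A * (w * p * (1.0 - p))[:, None]).T @ A
        hess += ridge * np.eye(A.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def predict_proba(X, beta):
    A = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    return sigmoid(A @ beta)


# Synthetic, purely illustrative data: 4 hypothetical features, rare positives.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))
true_logit = -3.5 + 1.0 * X[:, 0] - 0.5 * X[:, 1]
y = (rng.random(2000) < sigmoid(true_logit)).astype(int)

tau = 0.5  # assumed population proportion of repeaters
beta = fit_weighted_logistic(X, y, tau)
w = class_weights(y, tau)
p_hat = predict_proba(X, beta)
is_repeater = p_hat > 0.5  # classify as repeater above probability 0.5
```

Under these assumed weights the weighted sample behaves as if a fraction $\tau$ of the sources were repeaters, so raising $\tau$ shifts the fitted probabilities upward and makes the classifier more willing to label a source as a repeater.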
- Publication: arXiv e-prints
- Pub Date: October 2024
- DOI: 10.48550/arXiv.2410.17474
- arXiv: arXiv:2410.17474
- Bibcode: 2024arXiv241017474H
- Keywords: Astrophysics - High Energy Astrophysical Phenomena; Astrophysics - Instrumentation and Methods for Astrophysics; Statistics - Applications
- E-Print: 16 pages, 7 figures. Submitted to ApJ