Statistical Jump Model for Mixed-Type Data with Missing Data Imputation
Abstract
In this paper, we address the challenge of clustering mixed-type data with temporal evolution by introducing the statistical jump model for mixed-type data. This novel framework incorporates regime persistence, enhancing interpretability and reducing the frequency of state switches, and efficiently handles missing data. The model is easily interpretable through its state-conditional means and modes, making it accessible to practitioners and policymakers. We validate our approach through extensive simulation studies and an empirical application to air quality data, demonstrating its superiority in inferring persistent air quality regimes compared to the traditional air quality index. Our contributions include a robust method for mixed-type temporal clustering, effective missing data management, and practical insights for environmental monitoring.
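To make the idea of regime persistence concrete, the following is a minimal sketch of a jump-penalized clustering loop for mixed-type series. It is not the authors' implementation. It assumes a Gower-style dissimilarity (absolute difference for continuous features, mismatch indicator for categorical ones), continuous features already scaled to comparable ranges, missing values encoded as `None` and skipped in the dissimilarity, and a fixed jump penalty `lam`. All function names (`mixed_loss`, `decode_states`, `update_prototypes`, `fit`) and the toy data are hypothetical.

```python
from collections import Counter

def mixed_loss(x, proto, is_cat):
    """Average per-feature dissimilarity to a prototype, skipping missing (None) features."""
    total, used = 0.0, 0
    for xv, pv, cat in zip(x, proto, is_cat):
        if xv is None:
            continue
        total += (0.0 if xv == pv else 1.0) if cat else abs(xv - pv)
        used += 1
    return total / used if used else 0.0

def decode_states(loss, lam):
    """Minimise sum_t loss[t][s_t] + lam * (number of state switches) by dynamic programming."""
    K = len(loss[0])
    cost, back = [list(loss[0])], []
    for t in range(1, len(loss)):
        prev, row, ptr = cost[-1], [], []
        for k in range(K):
            j = min(range(K), key=lambda i: prev[i] + (lam if i != k else 0.0))
            row.append(loss[t][k] + prev[j] + (lam if j != k else 0.0))
            ptr.append(j)
        cost.append(row)
        back.append(ptr)
    s = min(range(K), key=lambda k: cost[-1][k])
    path = [s]
    for ptr in reversed(back):  # follow back-pointers from the last time step
        s = ptr[s]
        path.append(s)
    return path[::-1]

def update_prototypes(X, states, is_cat, old):
    """State-conditional mean (continuous) or mode (categorical) over observed values only."""
    protos = []
    for k in range(len(old)):
        proto = []
        for d, cat in enumerate(is_cat):
            vals = [x[d] for x, s in zip(X, states) if s == k and x[d] is not None]
            if not vals:
                proto.append(old[k][d])  # keep the old value if nothing was observed
            elif cat:
                proto.append(Counter(vals).most_common(1)[0][0])
            else:
                proto.append(sum(vals) / len(vals))
        protos.append(proto)
    return protos

def fit(X, init_protos, is_cat, lam, n_iter=10):
    """Alternate between decoding the state sequence and updating the prototypes."""
    protos, states = [list(p) for p in init_protos], None
    for _ in range(n_iter):
        loss = [[mixed_loss(x, p, is_cat) for p in protos] for x in X]
        new_states = decode_states(loss, lam)
        if new_states == states:
            break
        states = new_states
        protos = update_prototypes(X, states, is_cat, protos)
    return states, protos

# Toy series: one continuous and one categorical feature, with missing entries.
X = [(0.1, "a"), (0.2, "a"), (None, "a"), (0.9, "b"), (1.0, None), (1.1, "b")]
is_cat = (False, True)
states, protos = fit(X, [(0.0, "a"), (1.0, "b")], is_cat, lam=0.5)

# Fill each missing entry with the prototype value of its assigned state.
imputed = [tuple(p if v is None else v for v, p in zip(x, protos[s]))
           for x, s in zip(X, states)]
```

In this sketch a larger `lam` makes switches more expensive and so yields more persistent regimes, while `lam = 0` reduces the decoding step to assigning each time point to its nearest prototype. The final line illustrates one simple way the fitted state-conditional means and modes could be used to fill in missing entries.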
- Publication: arXiv e-prints
- Pub Date: September 2024
- DOI: 10.48550/arXiv.2409.01208
- arXiv: arXiv:2409.01208
- Bibcode: 2024arXiv240901208C
- Keywords: Statistics - Methodology; Statistics - Applications; Statistics - Machine Learning; 37M10; 62D10; 62H30
- E-Print: 25 pages, 5 figures, 9 tables