Uncertainty in Lung Cancer Stage for Outcome Estimation via Set-Valued Classification
Abstract
Difficulty in identifying cancer stage in health care claims data has limited research on oncology quality of care and health outcomes. We fit prediction algorithms for classifying lung cancer stage into three classes (stages I/II, stage III, and stage IV) using claims data, and then demonstrate a method for incorporating the classification uncertainty into outcome estimation. Leveraging set-valued classification and split conformal inference, we show how a fixed algorithm developed in one cohort of data may be deployed in another, while rigorously accounting for uncertainty from the initial classification step. We demonstrate this process using SEER cancer registry data linked with Medicare claims data.
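The idea behind set-valued classification with split conformal inference can be illustrated with a minimal sketch. It is not taken from the paper or its repository: the function names `conformal_threshold` and `prediction_sets`, the nonconformity score (one minus the predicted probability of the true class), and the simulated data are all assumptions chosen for illustration, and the authors' procedure may differ. Classes 0, 1, and 2 stand in for stages I/II, III, and IV. A fixed classifier's probabilities on a held-out calibration cohort determine a threshold, and each new patient then receives a set of plausible stages rather than a single label.

```python
import numpy as np


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split conformal threshold from a held-out calibration cohort.

    cal_probs:  (n, K) predicted class probabilities from a fixed classifier.
    cal_labels: (n,) true class indices (here 0 = I/II, 1 = III, 2 = IV).
    alpha:      target miscoverage rate.
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    n = len(cal_labels)
    # Assumed nonconformity score: 1 - probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: every class is included.
        return np.inf
    return np.sort(scores)[k - 1]


def prediction_sets(probs, qhat):
    """Boolean (m, K) matrix; entry (i, c) is True if class c is in set i."""
    return (1.0 - np.asarray(probs, dtype=float)) <= qhat


# Simulated illustration only (no real registry or claims data).
rng = np.random.default_rng(0)


def simulate(m):
    # Hypothetical classifier output; labels are drawn from those probabilities.
    probs = rng.dirichlet([2.0, 1.0, 1.0], size=m)
    labels = np.array([rng.choice(3, p=p) for p in probs])
    return probs, labels


cal_probs, cal_labels = simulate(500)    # calibration cohort
new_probs, new_labels = simulate(2000)   # deployment cohort

qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
sets = prediction_sets(new_probs, qhat)
coverage = sets[np.arange(len(new_labels)), new_labels].mean()
print(f"empirical coverage: {coverage:.3f}, mean set size: {sets.sum(axis=1).mean():.2f}")
```

When the calibration and deployment data are exchangeable, sets built this way contain the true class with probability at least 1 - alpha on average. A patient whose set contains more than one stage is one whose classification uncertainty would need to be carried into any downstream outcome estimate.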
- Publication: arXiv e-prints
- Pub Date: July 2021
- DOI: 10.48550/arXiv.2107.01251
- arXiv: arXiv:2107.01251
- Bibcode: 2021arXiv210701251B
- Keywords: Statistics - Applications
- E-Print: Code available at: https://github.com/sl-bergquist/cancer_classification