Q-learning in Dynamic Treatment Regimes with Misclassified Binary Outcome
Abstract
Precision medicine makes use of dynamic treatment regimes (DTRs), which are sequences of treatment decision rules that take patient-level information as input and recommend a treatment at each stage. The primary goal of DTR studies is to identify an optimal DTR, that is, a sequence of treatment decision rules that leads to the best expected clinical outcome. Statistical methods have been developed in recent years to estimate an optimal DTR, including Q-learning, a regression-based method in the DTR literature. Although there are many studies of Q-learning, little attention has been given to its behavior in the presence of noisy data, such as misclassified outcomes. In this paper, we investigate the effect of outcome misclassification on Q-learning and propose a correction method to accommodate the misclassification effect. Simulation studies are conducted to demonstrate the satisfactory performance of the proposed method. We illustrate the proposed method with two examples: the National Health and Nutrition Examination Survey Data I Epidemiologic Follow-up Study and a smoking cessation program.
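To make the misclassification effect concrete, the following is a minimal sketch and not the correction method proposed in the paper. It assumes a binary outcome whose misclassification does not depend on covariates or treatment, with known sensitivity P(Y* = 1 | Y = 1) and specificity P(Y* = 0 | Y = 0); the function names `observed_probability` and `corrected_probability` are hypothetical. It shows how the probability of the observed outcome differs from that of the true outcome, and how the standard algebraic inversion recovers the true probability.

```python
def observed_probability(p_true, sensitivity, specificity):
    """Probability that the *observed* binary outcome equals 1.

    Assumes misclassification is governed only by a fixed sensitivity
    and specificity (an illustrative assumption, not taken from the paper).
    """
    return sensitivity * p_true + (1.0 - specificity) * (1.0 - p_true)


def corrected_probability(p_obs, sensitivity, specificity):
    """Invert the relation above to recover P(Y = 1) from P(Y* = 1)."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        # The inversion is only meaningful when the observed outcome
        # is informative, i.e. sensitivity + specificity > 1.
        raise ValueError("sensitivity + specificity must exceed 1")
    p = (p_obs - (1.0 - specificity)) / denom
    # Clip to [0, 1]; sampling noise in p_obs can push the raw value outside.
    return min(1.0, max(0.0, p))


# Example: a true success probability of 0.3 looks larger once the
# outcome is misclassified, and the inversion recovers the true value.
p_star = observed_probability(0.3, sensitivity=0.9, specificity=0.8)
p_back = corrected_probability(p_star, sensitivity=0.9, specificity=0.8)
print(p_star, p_back)
```

In a Q-learning setting, a bias of this kind in the stage-wise outcome model could carry over into the estimated decision rules, which is the motivation for a correction; how the paper actually performs that correction is not specified in this abstract.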
- Publication: arXiv e-prints
- Pub Date: April 2024
- DOI: 10.48550/arXiv.2404.04697
- arXiv: arXiv:2404.04697
- Bibcode: 2024arXiv240404697L
- Keywords: Statistics - Methodology