Bootstrap Inference when Using Multiple Imputation
Abstract
Many modern estimators require bootstrapping to calculate confidence intervals because either no analytic standard error is available or the distribution of the parameter of interest is non-symmetric. However, it remains unclear how to obtain valid bootstrap inference when multiple imputation is used to address missing data. We present four methods that are intuitively appealing, easy to implement, and combine bootstrap estimation with multiple imputation. We show that three of the four approaches yield valid inference, but that the performance of the methods varies with the number of imputed data sets and the extent of missingness. Simulation studies reveal the behavior of our approaches in finite samples. A topical analysis from HIV treatment research, which determines the optimal timing of antiretroviral treatment initiation in young children, demonstrates the practical implications of the four methods in a sophisticated and realistic setting. This analysis suffers from missing data and uses the $g$-formula for inference, a method for which no standard errors are available.
- Publication: arXiv e-prints
- Pub Date: February 2016
- arXiv: arXiv:1602.07933
- Bibcode: 2016arXiv160207933S
- Keywords: Statistics - Methodology
- E-Print: doi:10.1002/sim.7654