On nearly assumption-free tests of nominal confidence interval coverage for causal parameters estimated by machine learning
Abstract
For many causal effect parameters of interest, doubly robust machine learning (DRML) estimators $\hat{\psi}_{1}$ are the state-of-the-art, incorporating the good prediction performance of machine learning; the decreased bias of doubly robust estimators; and the analytic tractability and bias reduction of sample splitting with cross-fitting. Nonetheless, even in the absence of confounding by unmeasured factors, the nominal $(1 - \alpha)$ Wald confidence interval $\hat{\psi}_{1} \pm z_{\alpha / 2} \widehat{\mathsf{se}}[\hat{\psi}_{1}]$ may undercover even in large samples, because the bias of $\hat{\psi}_{1}$ may be of the same or even larger order than its standard error of order $n^{-1/2}$. In this paper, we introduce essentially assumption-free tests that (i) can falsify the null hypothesis that the bias of $\hat{\psi}_{1}$ is of smaller order than its standard error, (ii) can provide an upper confidence bound on the true coverage of the Wald interval, and (iii) are valid under the null under no smoothness/sparsity assumptions on the nuisance parameters. The tests, which we refer to as \underline{A}ssumption \underline{F}ree \underline{E}mpirical \underline{C}overage \underline{T}ests (AFECTs), are based on a U-statistic that estimates part of the bias of $\hat{\psi}_{1}$.
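The undercoverage phenomenon the abstract describes can be seen in a toy simulation (not from the paper, and not the authors' AFECT procedure): if an estimator carries a bias of order $n^{-1/2}$, the same order as its standard error, the standardized bias does not vanish as $n$ grows, so the nominal 95% Wald interval covers the truth at a rate strictly below 95% no matter how large the sample. The `bias_scale` parameter and the unit-variance Gaussian setup below are illustrative assumptions.

```python
import math
import random

random.seed(0)

def wald_coverage(n, n_sims=2000, bias_scale=1.0):
    """Empirical coverage of the nominal 95% Wald interval for a sample
    mean whose bias is bias_scale / sqrt(n), i.e. the same order as its
    standard error (a hypothetical setup for illustration only)."""
    z = 1.96                      # two-sided 95% normal quantile
    true_mean = 0.0
    bias = bias_scale / math.sqrt(n)
    se = 1.0 / math.sqrt(n)       # known unit variance, for simplicity
    hits = 0
    for _ in range(n_sims):
        xs = [random.gauss(true_mean + bias, 1.0) for _ in range(n)]
        m = sum(xs) / n
        if abs(m - true_mean) <= z * se:
            hits += 1
    return hits / n_sims
```

With `bias_scale=1.0` the standardized bias is fixed at 1, and the asymptotic coverage is $\Phi(1.96 - 1) - \Phi(-1.96 - 1) \approx 0.83$ rather than 0.95, for every $n$; increasing `n` does not repair the interval, which is exactly why a test that can falsify "bias is of smaller order than the standard error" is useful.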
 Publication:

arXiv e-prints
 Pub Date:
 April 2019
 DOI:
 10.48550/arXiv.1904.04276
 arXiv:
 arXiv:1904.04276
 Bibcode:
 2019arXiv190404276L
 Keywords:

 Statistics - Machine Learning;
 Computer Science - Machine Learning;
 Mathematics - Statistics Theory;
 Statistics - Methodology
 E-Print:
 Significant updates from the previous version. In press at Statistical Science