Selective machine learning of doubly robust functionals

doi:10.48550/arXiv.1911.02029

Selective machine learning of doubly robust functionals

While model selection is a well-studied topic in parametric and nonparametric regression or density estimation, selection of possibly high-dimensional nuisance parameters in semiparametric problems is far less developed. In this paper, we propose a selective machine learning framework for making inferences about a finite-dimensional functional defined on a semiparametric model, when the latter admits a doubly robust estimating function and several candidate machine learning algorithms are available for estimating the nuisance parameters. We introduce a new selection criterion aimed at bias reduction in estimating the functional of interest based on a novel definition of pseudo-risk inspired by the double robustness property. Intuitively, the proposed criterion selects a pair of learners with the smallest pseudo-risk, so that the estimated functional is least sensitive to perturbations of a nuisance parameter. We establish an oracle property for a multi-fold cross-validation version of the new selection criterion which states that our empirical criterion performs nearly as well as an oracle with a priori knowledge of the pseudo-risk for each pair of candidate learners. Finally, we apply the approach to model selection of a semiparametric estimator of average treatment effect given an ensemble of candidate machine learners to account for confounding in an observational study which we illustrate in simulations and a data application.

Publication:

arXiv e-prints

Pub Date:

November 2019

DOI:

10.48550/arXiv.1911.02029

arXiv:

arXiv:1911.02029

Bibcode:

2019arXiv191102029C

Keywords:

Statistics - Methodology;
Mathematics - Statistics Theory;
Statistics - Machine Learning

E-Print:

To appear in Biometrika

NASA/ADS

Selective machine learning of doubly robust functionals

Abstract