Robustness Against Weak or Invalid Instruments: Exploring Nonlinear Treatment Models with Machine Learning
Abstract
We discuss causal inference for observational studies with possibly invalid instrumental variables. We propose a novel methodology called two-stage curvature identification (TSCI), which explores the nonlinear treatment model with machine learning. The first-stage machine learning improves the instrumental variable's strength and adjusts for different forms of violation of the instrumental variable assumptions. The success of TSCI requires the instrumental variable's effect on the treatment to differ from its violation form. A novel bias correction step is implemented to remove the bias resulting from the potentially high complexity of machine learning. Our proposed TSCI estimator is shown to be asymptotically unbiased and Gaussian even if the machine learning algorithm does not consistently estimate the treatment model. Furthermore, we design a data-dependent method to choose the best among several candidate violation forms. We apply TSCI to study the effect of education on earnings.
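The identification idea in the abstract can be illustrated with a small simulation. What follows is a minimal sketch under strong simplifying assumptions, not the paper's TSCI estimator: a cubic polynomial basis stands in for the first-stage machine learning, the violation form is assumed to be linear in the instrument, and the bias correction and data-dependent selection steps are omitted. All function names, the data-generating process, and the parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=20000, beta=1.0, pi=0.5, seed=0):
    """Hypothetical data: nonlinear treatment model, linearly invalid instrument."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                          # instrument
    u = rng.normal(size=n)                          # unobserved confounder
    d = z + z**2 + u + rng.normal(size=n)           # treatment, nonlinear in z
    y = beta * d + pi * z + u + rng.normal(size=n)  # pi * z violates exclusion
    return z, d, y


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def naive_iv(z, d, y):
    """Standard single-instrument IV ratio, which wrongly treats z as valid."""
    return np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]


def curvature_estimate(z, d, y):
    """Two-stage sketch: fit a nonlinear first stage, then adjust for a
    linear violation form so that only the curvature identifies the effect."""
    ones = np.ones_like(z)
    # First stage: cubic basis as a stand-in for a machine learning fit.
    basis = np.column_stack([ones, z, z**2, z**3])
    d_hat = basis @ ols(basis, d)
    # Second stage: regress y on the fitted treatment and the assumed
    # violation form (intercept and z); the first coefficient is the effect.
    X = np.column_stack([d_hat, ones, z])
    return ols(X, y)[0]


z, d, y = simulate()
print("naive IV estimate:      ", naive_iv(z, d, y))
print("curvature-based estimate:", curvature_estimate(z, d, y))
```

In this design the naive ratio converges to beta + pi rather than beta, because the direct effect of the instrument on the outcome is attributed to the treatment. The curvature-based estimate recovers beta only because the treatment depends on z through a term (here z squared) that the assumed linear violation form cannot absorb, which is the condition the abstract states for TSCI to succeed.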
- Publication:
arXiv e-prints
- Pub Date:
- March 2022
- DOI:
- arXiv:
- arXiv:2203.12808
- Bibcode:
- 2022arXiv220312808G
- Keywords:
- Statistics - Methodology;
- Mathematics - Statistics Theory;
- Statistics - Machine Learning