The Selectively Adaptive Lasso
Abstract
Machine learning regression methods allow estimation of functions without unrealistic parametric assumptions. Although they can achieve exceptional predictive performance, most lack the theoretical convergence rates necessary for semiparametric efficient estimation (e.g., TMLE, AIPW) of parameters like average treatment effects. The Highly Adaptive Lasso (HAL) is the only regression method proven to converge quickly enough for a meaningfully large class of functions, independent of the dimensionality of the predictors. Unfortunately, HAL is not computationally scalable. In this paper we build upon the theory of HAL to construct the Selectively Adaptive Lasso (SAL), a new algorithm which retains HAL's dimension-free, nonparametric convergence rate but which also scales computationally to massive datasets. To accomplish this, we prove some general theoretical results pertaining to empirical loss minimization in nested Donsker classes. Our resulting algorithm is a form of gradient tree boosting with an adaptive learning rate, which makes it fast and trivial to implement with off-the-shelf software. Finally, we show that our algorithm retains the performance of standard gradient boosting on a diverse group of real-world datasets. SAL makes semiparametric efficient estimators practically possible and theoretically justifiable in many big data settings.
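To illustrate the general idea of gradient tree boosting with an adaptive learning rate, here is a minimal sketch built from off-the-shelf tree software. This is not the authors' SAL algorithm: the function names, the squared-error loss, and the closed-form line search for the step size are all illustrative choices, not details taken from the paper.

```python
# Hedged sketch: gradient tree boosting where the step size (learning
# rate) is chosen adaptively each round by a closed-form line search,
# rather than fixed in advance. Illustrative only, not the paper's SAL.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost(X, y, n_rounds=50, max_depth=2):
    pred = np.full(len(y), y.mean())  # initialize at the mean
    trees, steps = [], []
    for _ in range(n_rounds):
        resid = y - pred  # negative gradient of squared-error loss
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, resid)
        h = tree.predict(X)
        # adaptive step: minimize ||resid - nu * h||^2 over scalar nu
        nu = h @ resid / max(h @ h, 1e-12)
        pred += nu * h
        trees.append(tree)
        steps.append(nu)
    return trees, steps, pred

# toy usage on a synthetic regression problem
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
_, _, fitted = boost(X, y)
```

For squared-error loss the optimal scalar step has the closed form above; for general losses the line search would be numerical.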
Publication: arXiv e-prints
Pub Date: May 2022
arXiv: arXiv:2205.10697
Bibcode: 2022arXiv220510697S
Keywords:
- Statistics - Machine Learning
- Computer Science - Machine Learning
- Mathematics - Statistics Theory