Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery

This paper addresses distributed estimation and support recovery for high-dimensional linear quantile regression. Quantile regression is a popular alternative to least-squares regression because it is robust to outliers and data heterogeneity. However, the non-smoothness of the check loss function poses substantial challenges to both computation and theory in the distributed setting. To tackle these problems, we transform the original quantile regression problem into a least-squares optimization problem. By applying a double-smoothing approach, we extend a previous Newton-type distributed approach without requiring the restrictive assumption that the error term is independent of the covariates. We develop an algorithm that is efficient in both computation and communication. Theoretically, the proposed distributed estimator achieves a near-oracle convergence rate and high support recovery accuracy after a constant number of iterations. Extensive experiments on synthetic examples and a real data application further demonstrate the effectiveness of the proposed method.
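
As a point of reference for the check loss named in the abstract, the sketch below writes out the standard quantile (check) loss and shows numerically that minimizing its average over a constant recovers a sample quantile. It is a minimal illustration only and is not the distributed or double-smoothed algorithm of the paper. The function names `check_loss` and `empirical_risk`, the heavy-tailed sample, and the choice `tau = 0.9` are assumptions made for the example.

```python
import numpy as np

def check_loss(u, tau):
    """Standard check loss: rho_tau(u) = u * (tau - 1{u < 0}).

    It is piecewise linear with a kink at u = 0, which is the
    non-smoothness the abstract refers to.
    """
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def empirical_risk(q, y, tau):
    """Average check loss of the residuals y - q for a constant fit q."""
    return check_loss(y - q, tau).mean()

# Illustrative data (an assumption, not from the paper): heavy-tailed noise.
rng = np.random.default_rng(0)
y = rng.standard_t(df=2, size=2001)
tau = 0.9

# The empirical risk is convex and piecewise linear in q, so a minimizer
# lies at one of the data points; search over them directly.
risks = np.array([empirical_risk(q, y, tau) for q in y])
q_hat = y[np.argmin(risks)]

# Compare with the sample quantile computed by NumPy.
print(q_hat, np.quantile(y, tau))
```

The brute-force search is quadratic in the sample size and is used here only to keep the example transparent. In the high-dimensional regression setting the constant `q` is replaced by a linear predictor and a sparsity penalty is typically added.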

Publication:

arXiv e-prints

Pub Date:

May 2024

DOI:

10.48550/arXiv.2405.07552

arXiv:

arXiv:2405.07552

Bibcode:

2024arXiv240507552W

Keywords:

Statistics - Machine Learning;
Computer Science - Machine Learning;
Statistics - Methodology

E-Print:

Forty-first International Conference on Machine Learning (ICML 2024), 27 pages, 4 figures, 14 tables
