Breaking the curse of dimensionality in regression
Abstract
Models with many signals, that is, high-dimensional models, often impose structure on the signal strengths. The common assumption is that only a few signals are strong and that most are zero or (collectively) close to zero. However, such a requirement might not hold in many real-life applications. In this article, we are interested in conducting large-scale inference in models that might have signals of mixed strengths. The key challenge is that the signals not under test might be collectively non-negligible (although individually small) and cannot be accurately learned. This article develops a new class of tests that arise from a moment-matching formulation. A virtue of these moment-matching statistics is their ability to borrow strength across features, adapt to the sparsity size, and adjust for testing a growing number of hypotheses. The GRoup-level Inference of Parameter (GRIP) test harvests effective sparsity structures with hypothesis formulation for an efficient multiple testing procedure. Simulated data show that GRIP's error control is far better than that of the alternative methods. We develop a minimax theory, demonstrating optimality of GRIP for a broad range of models, including those where the model is a mixture of sparse and high-dimensional dense signals.
- Publication: arXiv e-prints
- Pub Date: August 2017
- DOI: 10.48550/arXiv.1708.00430
- arXiv: arXiv:1708.00430
- Bibcode: 2017arXiv170800430Z
- Keywords: Statistics - Methodology; Computer Science - Information Theory; Mathematics - Statistics Theory; Statistics - Machine Learning
- E-Print: 51 pages