Adaptive estimation of High-Dimensional Signal-to-Noise Ratios
Abstract
We consider the equivalent problems of estimating the residual variance, the proportion of explained variance $\eta$, and the signal strength in a high-dimensional linear regression model with Gaussian random design. Our aim is to understand how not knowing the sparsity of the regression parameter, and not knowing the distribution of the design, affects the minimax estimation rates of $\eta$. Depending on the sparsity $k$ of the regression parameter, optimal estimators of $\eta$ either rely on estimating the regression parameter or are based on U-type statistics, and their minimax rates depend on $k$. In the important situation where $k$ is unknown, we build an adaptive procedure whose convergence rate simultaneously achieves the minimax risk over all $k$, up to a logarithmic loss which we prove to be unavoidable. Finally, knowledge of the design distribution is shown to play a critical role: when the distribution of the design is unknown, consistent estimation of the explained variance is possible only in much narrower regimes than for a known design distribution.
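To make the U-type approach concrete, here is a minimal sketch in the simplest known-design case. It is an illustration only and is not claimed to be the paper's estimator. It assumes, beyond what the abstract states, that the rows $x_i$ of the design are i.i.d. standard Gaussian (identity covariance), that the model is $y_i = x_i^\top \beta + \varepsilon_i$ with noise variance $\sigma^2$, and that $\eta = \|\beta\|^2 / (\|\beta\|^2 + \sigma^2)$. Under these assumptions $\frac{1}{n(n-1)} \sum_{i \neq j} y_i y_j x_i^\top x_j$ is unbiased for $\|\beta\|^2$, and $\frac{1}{n} \sum_i y_i^2$ is unbiased for $\|\beta\|^2 + \sigma^2$. The function names `simulate` and `eta_hat` and all parameter values are hypothetical.

```python
import random


def simulate(n, d, k, signal, sigma, seed=0):
    """Draw (X, y) from y = X beta + noise with i.i.d. N(0, 1) design entries.

    beta is k-sparse with squared norm `signal` (an illustrative choice).
    """
    rng = random.Random(seed)
    beta = [0.0] * d
    for j in range(k):
        beta[j] = (signal / k) ** 0.5
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [sum(x[j] * beta[j] for j in range(k)) + rng.gauss(0.0, sigma) for x in X]
    return X, y


def eta_hat(X, y):
    """Ratio estimate of the proportion of explained variance.

    Assumes identity design covariance. Uses the identity
    sum_{i != j} y_i y_j <x_i, x_j> = ||X^T y||^2 - sum_i y_i^2 ||x_i||^2
    so that no parameter estimate and no sparsity level is needed.
    """
    n, d = len(X), len(X[0])
    xty = [0.0] * d  # accumulates X^T y
    diag = 0.0       # accumulates sum_i y_i^2 ||x_i||^2
    for x, yi in zip(X, y):
        sq_norm = 0.0
        for j, xj in enumerate(x):
            xty[j] += yi * xj
            sq_norm += xj * xj
        diag += yi * yi * sq_norm
    total = sum(v * v for v in xty)
    t = (total - diag) / (n * (n - 1))  # unbiased for ||beta||^2 when covariance is I
    v = sum(yi * yi for yi in y) / n    # unbiased for ||beta||^2 + sigma^2
    return t / v


if __name__ == "__main__":
    # More covariates than observations; true eta = 1 / (1 + 1) = 0.5.
    X, y = simulate(n=500, d=600, k=10, signal=1.0, sigma=1.0, seed=1)
    print("estimated eta:", eta_hat(X, y))
```

The estimate is noisy at this sample size and can fall outside $[0, 1]$, since the numerator is an unbiased but unconstrained statistic. The sketch relies on the design covariance being known to be the identity, which is the kind of knowledge the abstract identifies as critical.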
- Publication: arXiv e-prints
- Pub Date: February 2016
- arXiv: arXiv:1602.08006
- Bibcode: 2016arXiv160208006V
- Keywords: Statistics - Methodology