Sharp Variable Selection of a Sparse Submatrix in a High-Dimensional Noisy Matrix
Abstract
We observe an $N\times M$ matrix of independent, identically distributed Gaussian random variables which are centered except for elements of some submatrix of size $n\times m$ where the mean is larger than some $a>0$. The submatrix is sparse in the sense that $n/N$ and $m/M$ tend to 0, whereas $n,\, m, \, N$ and $M$ tend to infinity. We consider the problem of selecting the random variables with significantly large mean values. We give sufficient conditions on $a$ as a function of $n,\, m,\,N$ and $M$ and construct a uniformly consistent procedure in order to do sharp variable selection. We also prove the minimax lower bounds under necessary conditions which are complementary to the previous conditions. The critical values $a^*$ separating the necessary and sufficient conditions are sharp (we show exact constants). We note a gap between the critical values $a^*$ for selection of variables and that of detecting that such a submatrix exists given by Butucea and Ingster (2012). When $a^*$ is in this gap, consistent detection is possible but no consistent selector of the corresponding variables can be found.
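To make the observation model concrete, the following is a minimal simulation sketch, not the procedure constructed in the paper. It assumes unit-variance noise and a planted block whose mean equals $a$ exactly (the abstract only requires the mean to exceed $a$). The function names `simulate` and `naive_select` are hypothetical, and the selector is a deliberately simple row-sum and column-sum rule that assumes $n$ and $m$ are known; it is included only to illustrate what "selecting the variables" means and makes no claim to attain the sharp critical values.

```python
import random

def simulate(N, M, n, m, a, seed=0):
    """Draw an N x M matrix of N(0, 1) entries with a planted n x m block of mean a.

    Assumptions (not from the abstract): unit variance, block mean exactly a.
    """
    rng = random.Random(seed)
    rows = set(rng.sample(range(N), n))  # hidden row indices of the submatrix
    cols = set(rng.sample(range(M), m))  # hidden column indices of the submatrix
    Y = [
        [rng.gauss(0.0, 1.0) + (a if (i in rows and j in cols) else 0.0)
         for j in range(M)]
        for i in range(N)
    ]
    return Y, rows, cols

def naive_select(Y, n, m):
    """Return the n rows and m columns with the largest sums (illustrative only)."""
    N, M = len(Y), len(Y[0])
    row_sums = [sum(Y[i]) for i in range(N)]
    col_sums = [sum(Y[i][j] for i in range(N)) for j in range(M)]
    top_rows = set(sorted(range(N), key=lambda i: row_sums[i], reverse=True)[:n])
    top_cols = set(sorted(range(M), key=lambda j: col_sums[j], reverse=True)[:m])
    return top_rows, top_cols

if __name__ == "__main__":
    Y, rows, cols = simulate(N=60, M=50, n=5, m=4, a=20.0, seed=1)
    est_rows, est_cols = naive_select(Y, n=5, m=4)
    print("rows recovered:", est_rows == rows)
    print("columns recovered:", est_cols == cols)
```

With a very large $a$ relative to the noise, as in the usage above, even this crude rule separates the planted rows and columns; the regime of interest in the abstract is the much harder one where $a$ is near the critical value $a^*$.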
- Publication: arXiv e-prints
- Pub Date: March 2013
- DOI: 10.48550/arXiv.1303.5647
- arXiv: arXiv:1303.5647
- Bibcode: 2013arXiv1303.5647B
- Keywords: Mathematics - Statistics Theory; 62C20; 62G05; 62G20