Statistical limits of spiked tensor models
Abstract
We study the statistical limits of both detecting and estimating a rank-one deformation of a symmetric random Gaussian tensor. We establish upper and lower bounds on the critical signal-to-noise ratio, under a variety of priors for the planted vector: (i) a uniformly sampled unit vector, (ii) i.i.d. $\pm 1$ entries, and (iii) a sparse vector where a constant fraction $\rho$ of entries are i.i.d. $\pm 1$ and the rest are zero. For each of these cases, our upper and lower bounds match up to a $1+o(1)$ factor as the order $d$ of the tensor becomes large. For sparse signals (iii), our bounds are also asymptotically tight in the sparse limit $\rho \to 0$ for any fixed $d$ (including the $d=2$ case of sparse PCA). Our upper bounds for (i) demonstrate a phenomenon reminiscent of the work of Baik, Ben Arous and Péché: an `eigenvalue' of a perturbed tensor emerges from the bulk at a strictly lower signal-to-noise ratio than when the perturbation itself exceeds the bulk; we quantify the size of this effect. We also provide some general results for larger classes of priors. In particular, the large-$d$ asymptotics of the threshold location differs between problems with discrete priors versus continuous priors. Finally, for priors (i) and (ii) we carry out the replica prediction from statistical physics, which is conjectured to give the exact information-theoretic threshold for any fixed $d$. Of independent interest, we introduce a new improvement to the second moment method for contiguity, on which our lower bounds are based. Our technique conditions away from rare `bad' events that depend on interactions between the signal and noise. This enables us to close $\sqrt{2}$-factor gaps present in several previous works.
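As a concrete illustration of the model studied in the abstract, the following sketch samples an observation of the form $\lambda\, x^{\otimes d} + W$, where $x$ is drawn from prior (i) (a uniform unit vector) and $W$ is a symmetrized i.i.d. Gaussian noise tensor. The exact noise normalization (here $1/\sqrt{n}$) is an assumption for illustration; conventions vary across the literature, and the function name `spiked_tensor` is our own.

```python
import numpy as np
from itertools import permutations

def spiked_tensor(n, d, snr, rng):
    """Sample a rank-one spiked symmetric Gaussian tensor of order d.

    Returns snr * x^{tensor d} + symmetrized noise, with x a uniform
    unit vector (prior (i)). Normalization is one common convention,
    not necessarily the one used in the paper.
    """
    # Planted signal: uniformly random unit vector
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)

    # Rank-one spike x^{\otimes d}
    spike = x
    for _ in range(d - 1):
        spike = np.multiply.outer(spike, x)

    # Symmetric Gaussian noise: average an i.i.d. Gaussian tensor
    # over all permutations of its axes
    W = rng.standard_normal((n,) * d)
    perms = list(permutations(range(d)))
    W_sym = sum(np.transpose(W, p) for p in perms) / len(perms)

    return snr * spike + W_sym / np.sqrt(n)
```

For example, `spiked_tensor(50, 3, 2.0, np.random.default_rng(0))` returns a symmetric order-3 tensor; detection asks whether such an observation can be distinguished from pure noise (`snr = 0`), and estimation asks how well $x$ can be recovered.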
Publication: arXiv e-prints
Pub Date: December 2016
DOI: 10.48550/arXiv.1612.07728
arXiv: arXiv:1612.07728
Bibcode: 2016arXiv161207728P
Keywords: Mathematics - Probability; Computer Science - Information Theory; Mathematics - Statistics Theory; Statistics - Machine Learning
E-Print: 39 pages, 5 figures