On the principal components of sample covariance matrices
Abstract
We introduce a class of $M \times M$ sample covariance matrices $\mathcal Q$ which subsumes and generalizes several previous models. The associated population covariance matrix $\Sigma = \mathbb E \mathcal Q$ is assumed to differ from the identity by a matrix of bounded rank. All quantities except the rank of $\Sigma - I_M$ may depend on $M$ in an arbitrary fashion. We investigate the principal components, i.e.\ the top eigenvalues and eigenvectors, of $\mathcal Q$. We derive precise large deviation estimates on the generalized components $\langle \mathbf w, \boldsymbol \xi_i \rangle$ of the outlier and non-outlier eigenvectors $\boldsymbol \xi_i$. Our results also hold near the so-called BBP transition, where outliers are created or annihilated, and for degenerate or near-degenerate outliers. We believe the obtained rates of convergence to be optimal. In addition, we derive the asymptotic distribution of the generalized components of the non-outlier eigenvectors. A novel observation arising from our results is that, unlike the eigenvalues, the eigenvectors of the principal components contain information about the \emph{subcritical} spikes of $\Sigma$. The proofs use several results on the eigenvalues and eigenvectors of the uncorrelated matrix $\mathcal Q$, satisfying $\mathbb E \mathcal Q = I_M$, as input: the isotropic local Marchenko-Pastur law established in [9], level repulsion, and quantum unique ergodicity of the eigenvectors. The latter is a special case of a new universality result for the joint eigenvalue-eigenvector distribution.
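As an illustration of the BBP transition mentioned in the abstract, the following minimal sketch simulates the classical one-spike Gaussian model, which is only a special case of the class of matrices considered in the paper. It assumes $\Sigma = I_M + d\, \mathbf v \mathbf v^*$ with a unit vector $\mathbf v$, i.i.d. standard Gaussian entries, and the textbook parametrization with aspect ratio $\phi = M/N$. This is not the paper's own notation or normalization. The function name `simulate_spike` and all parameter names are hypothetical. The sketch compares the top sample eigenvalue and the squared overlap $\langle \mathbf v, \boldsymbol \xi_1 \rangle^2$ with the well-known limits for this model: an outlier at $(1+d)(1+\phi/d)$ with squared overlap $(1-\phi/d^2)/(1+\phi/d)$ when $d > \sqrt{\phi}$, and otherwise a top eigenvalue at the bulk edge $(1+\sqrt{\phi})^2$ with vanishing overlap.

```python
import numpy as np


def simulate_spike(M=200, N=800, d=2.0, seed=0):
    """Top eigenpair of a one-spike Gaussian sample covariance matrix.

    Assumed model (illustrative only): Sigma = I + d * v v^T with v = e_1,
    Q = Sigma^{1/2} X X^T Sigma^{1/2} / N, X an M x N standard Gaussian matrix.
    """
    rng = np.random.default_rng(seed)
    phi = M / N

    # Spike direction v = e_1, so Sigma^{1/2} only rescales the first row.
    v = np.zeros(M)
    v[0] = 1.0
    Y = rng.standard_normal((M, N))
    Y[0, :] *= np.sqrt(1.0 + d)

    Q = Y @ Y.T / N
    evals, evecs = np.linalg.eigh(Q)  # eigenvalues in ascending order
    top_eigenvalue = float(evals[-1])
    overlap_sq = float(evecs[:, -1] @ v) ** 2

    # Classical large-M limits for this model.
    if d > np.sqrt(phi):  # supercritical spike: an outlier separates
        predicted_eigenvalue = (1.0 + d) * (1.0 + phi / d)
        predicted_overlap_sq = (1.0 - phi / d**2) / (1.0 + phi / d)
    else:  # subcritical spike: top eigenvalue sticks to the bulk edge
        predicted_eigenvalue = (1.0 + np.sqrt(phi)) ** 2
        predicted_overlap_sq = 0.0

    return {
        "top_eigenvalue": top_eigenvalue,
        "overlap_sq": overlap_sq,
        "predicted_eigenvalue": float(predicted_eigenvalue),
        "predicted_overlap_sq": float(predicted_overlap_sq),
    }


if __name__ == "__main__":
    for d in (2.0, 0.2):  # one supercritical and one subcritical spike
        print(d, simulate_spike(d=d))
```

In the subcritical case the top eigenvalue carries no visible trace of the spike in this simulation, which is the setting in which the abstract points out that the eigenvectors still contain information about the spike. The sketch does not attempt to reproduce the paper's quantitative estimates.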
- Publication: arXiv e-prints
- Pub Date: April 2014
- DOI: 10.48550/arXiv.1404.0788
- arXiv: arXiv:1404.0788
- Bibcode: 2014arXiv1404.0788B
- Keywords: Mathematics - Probability; Mathematical Physics; Mathematics - Statistics Theory; 15B52; 60B20; 82B44