Augmented sparse principal component analysis for high dimensional data
Abstract
We study the problem of estimating the leading eigenvectors of a high-dimensional population covariance matrix based on independent Gaussian observations. We establish lower bounds on the rates of convergence of the estimators of the leading eigenvectors under $l^q$-sparsity constraints when an $l^2$ loss function is used. We also propose an estimator of the leading eigenvectors based on a coordinate selection scheme combined with PCA and show that the proposed estimator achieves the optimal rate of convergence under a sparsity regime. Moreover, we establish that under certain scenarios, the usual PCA achieves the minimax convergence rate.
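The idea of combining coordinate selection with PCA can be illustrated with a minimal sketch. This is not the paper's augmented estimator. It is a simplified one-stage variant that keeps coordinates with large sample variance and runs PCA on the selected block. The function name `select_then_pca`, the unit noise variance, and the threshold constant `2.0` are all assumptions made for illustration.

```python
import numpy as np


def select_then_pca(X, n_components=1, threshold=None):
    """Illustrative sketch: variance-based coordinate selection, then PCA.

    X is an (n, p) array whose rows are observations, assumed mean zero.
    Returns a (p, n_components) array of estimated eigenvectors (zero on
    unselected coordinates) and the indices of the selected coordinates.
    """
    n, p = X.shape

    # Sample variance of each coordinate (the mean is assumed to be zero).
    variances = np.mean(X ** 2, axis=0)

    if threshold is None:
        # Hypothetical rule: noise variance is assumed to be 1, and the
        # constant 2.0 is an arbitrary illustrative choice.
        threshold = 1.0 + 2.0 * np.sqrt(np.log(p) / n)

    selected = np.flatnonzero(variances > threshold)
    if selected.size < n_components:
        # Fallback: keep the coordinates with the largest variances.
        selected = np.sort(np.argsort(variances)[-n_components:])

    # Sample covariance restricted to the selected coordinates.
    X_sub = X[:, selected]
    S_sub = X_sub.T @ X_sub / n

    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(S_sub)
    top = eigvecs[:, ::-1][:, :n_components]

    # Embed the estimate back into p dimensions.
    V = np.zeros((p, n_components))
    V[selected, :] = top
    return V, selected
```

Because PCA runs only on the selected block, the estimate is exactly zero on discarded coordinates, which is what lets a sparse leading eigenvector be recovered when the dimension is large relative to the sample size.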
- Publication:
- arXiv e-prints
- Pub Date:
- February 2012
- DOI:
- 10.48550/arXiv.1202.1242
- arXiv:
- arXiv:1202.1242
- Bibcode:
- 2012arXiv1202.1242P
- Keywords:
- Mathematics - Statistics Theory;
- Statistics - Methodology;
- 62G20 (Primary) 62H25 (Secondary)
- E-Print:
- This manuscript was written in 2007, and a version has been available on the first author's website; it is now posted to arXiv in its 2007 form. Revisions incorporating later work will be posted separately.