Augmented sparse principal component analysis for high dimensional data

doi:10.48550/arXiv.1202.1242

We study the problem of estimating the leading eigenvectors of a high-dimensional population covariance matrix based on independent Gaussian observations. We establish lower bounds on the rates of convergence of the estimators of the leading eigenvectors under $l^q$-sparsity constraints when an $l^2$ loss function is used. We also propose an estimator of the leading eigenvectors based on a coordinate selection scheme combined with PCA and show that the proposed estimator achieves the optimal rate of convergence under a sparsity regime. Moreover, we establish that under certain scenarios, the usual PCA achieves the minimax convergence rate.

Publication:

arXiv e-prints

Pub Date:

February 2012

DOI:

10.48550/arXiv.1202.1242

arXiv:

arXiv:1202.1242

Bibcode:

2012arXiv1202.1242P

Keywords:

Mathematics - Statistics Theory;
Statistics - Methodology;
62G20 (Primary) 62H25 (Secondary)

E-Print:

This manuscript was written in 2007, and a version has been available on the first author's website; it is now posted to arXiv in its 2007 form. Revisions incorporating later work will be posted separately.

NASA/ADS
