Augmented sparse principal component analysis for high dimensional data
Abstract
We study the problem of estimating the leading eigenvectors of a high-dimensional population covariance matrix based on independent Gaussian observations. We establish lower bounds on the rates of convergence of the estimators of the leading eigenvectors under $\ell_q$-sparsity constraints when an $\ell_2$ loss function is used. We also propose an estimator of the leading eigenvectors based on a coordinate selection scheme combined with PCA and show that the proposed estimator achieves the optimal rate of convergence under a sparsity regime. Moreover, we establish that under certain scenarios, the usual PCA achieves the minimax convergence rate.
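To make the idea of "coordinate selection combined with PCA" concrete, the following is a minimal sketch and not the estimator proposed in the paper. It assumes the simplest possible selection rule (keep the coordinates with the largest sample variances, then run ordinary PCA on the selected sub-block), a single-spike covariance model, and mean-zero data. The function name `coordinate_selection_pca` and the parameters `k`, `theta`, and `s` are hypothetical and chosen only for illustration.

```python
import numpy as np

def coordinate_selection_pca(X, k, num_components=1):
    """Simplified illustration: select k coordinates by sample variance,
    then apply PCA to the selected sub-block of the sample covariance.

    X is an (n, p) array whose rows are observations, assumed mean zero.
    k is a hypothetical tuning parameter (number of coordinates kept).
    """
    n, p = X.shape
    # Sample variance of each coordinate (mean assumed to be zero).
    variances = np.sum(X ** 2, axis=0) / n
    # Indices of the k coordinates with the largest sample variance.
    selected = np.sort(np.argsort(variances)[-k:])
    # Sample covariance restricted to the selected coordinates.
    X_sub = X[:, selected]
    S_sub = X_sub.T @ X_sub / n
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(S_sub)
    top = eigvecs[:, ::-1][:, :num_components]
    # Embed the estimate back into p dimensions; unselected entries are zero.
    V = np.zeros((p, num_components))
    V[selected, :] = top
    return V, selected

# Simulated example: covariance I + theta * v v^T with a sparse unit vector v.
rng = np.random.default_rng(0)
n, p, s, theta = 200, 500, 10, 20.0
v = np.zeros(p)
v[:s] = 1.0 / np.sqrt(s)
X = rng.standard_normal((n, p)) + np.sqrt(theta) * rng.standard_normal((n, 1)) * v

V, selected = coordinate_selection_pca(X, k=s)
# l2 loss, taking the sign ambiguity of eigenvectors into account.
loss = min(np.linalg.norm(V[:, 0] - v), np.linalg.norm(V[:, 0] + v))
print("selected coordinates:", selected)
print("l2 loss:", loss)
```

In this sketch the number of retained coordinates is supplied by hand. A data-driven selection rule would be needed in practice, and the sketch makes no claim about convergence rates.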
Cite
@article{arxiv.1202.1242,
title = {Augmented sparse principal component analysis for high dimensional data},
author = {Debashis Paul and Iain M. Johnstone},
journal = {arXiv preprint arXiv:1202.1242},
year = {2012}
}
Comments
This manuscript was written in 2007, and a version has been available on the first author's website, but it is posted to arXiv now in its 2007 form. Revisions incorporating later work will be posted separately.