Augmented sparse principal component analysis for high dimensional data

Statistics Theory 2012-02-07 v1 Methodology Statistics Theory

Abstract

We study the problem of estimating the leading eigenvectors of a high-dimensional population covariance matrix based on independent Gaussian observations. We establish lower bounds on the rates of convergence of the estimators of the leading eigenvectors under $l^q$-sparsity constraints when an $l^2$ loss function is used. We also propose an estimator of the leading eigenvectors based on a coordinate selection scheme combined with PCA and show that the proposed estimator achieves the optimal rate of convergence under a sparsity regime. Moreover, we establish that under certain scenarios, the usual PCA achieves the minimax convergence rate.
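To make the idea of "coordinate selection combined with PCA" concrete, the following is a minimal sketch of a simple variance-based selection step followed by PCA on the selected coordinates. It is not the augmented procedure of the paper, whose details the abstract does not give. The function name `select_then_pca`, the parameter `n_keep`, and the rule of keeping the coordinates with the largest sample variances are assumptions made only for illustration, and the data are assumed to have mean zero.

```python
import numpy as np


def select_then_pca(X, n_keep, n_components=1):
    """Illustrative sketch: coordinate selection followed by PCA.

    X            : (n, p) array, rows are observations (assumed mean zero).
    n_keep       : number of coordinates to retain (hypothetical tuning parameter).
    n_components : number of leading eigenvectors to estimate.

    Returns a (p, n_components) array of estimated eigenvectors that are zero
    outside the selected coordinates, and the sorted selected indices.
    """
    n, p = X.shape

    # Step 1: coordinate selection. Keep the n_keep coordinates with the
    # largest sample variances (an assumed, simple selection rule).
    variances = (X ** 2).mean(axis=0)
    keep = np.sort(np.argsort(variances)[-n_keep:])

    # Step 2: ordinary PCA restricted to the selected coordinates.
    X_sub = X[:, keep]
    S_sub = X_sub.T @ X_sub / n
    eigvals, eigvecs = np.linalg.eigh(S_sub)  # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :n_components]  # reorder to descending, take leading

    # Step 3: embed back into p dimensions, padding with zeros.
    V = np.zeros((p, n_components))
    V[keep, :] = top
    return V, keep
```

When the leading eigenvector is supported on a few coordinates whose variances clearly exceed the noise level, a rule of this kind discards most pure-noise coordinates before the eigendecomposition, which is the intuition behind selection-based estimators in the sparse regime.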

Cite

@article{arxiv.1202.1242,
  title  = {Augmented sparse principal component analysis for high dimensional data},
  author = {Debashis Paul and Iain M. Johnstone},
  journal= {arXiv preprint arXiv:1202.1242},
  year   = {2012}
}

Comments

This manuscript was written in 2007, and a version has been available on the first author's website; it is now posted to arXiv in its 2007 form. Revisions incorporating later work will be posted separately.
