
Sparse principal component analysis and iterative thresholding

Subjects: Statistics Theory, Methodology. Version 2, 2013-05-27.

Abstract

Principal component analysis (PCA) is a classical dimension reduction method which projects data onto the principal subspace spanned by the leading eigenvectors of the covariance matrix. However, it behaves poorly when the number of features p is comparable to, or even much larger than, the sample size n. In this paper, we propose a new iterative thresholding approach for estimating principal subspaces in the setting where the leading eigenvectors are sparse. Under a spiked covariance model, we find that the new approach recovers the principal subspace and leading eigenvectors consistently, and even optimally, in a range of high-dimensional sparse settings. Simulated examples also demonstrate its competitive performance.
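The abstract does not spell out the procedure, so the following is only a minimal sketch of what one iterative thresholding scheme for a sparse principal subspace could look like: orthogonal (power) iteration on the sample covariance matrix with an entrywise soft-thresholding step before each re-orthonormalization. The function names, the single fixed `threshold`, the choice of soft thresholding, and the fixed iteration count are all assumptions made for illustration, not the paper's exact algorithm or tuning.

```python
import numpy as np


def soft_threshold(x, t):
    """Entrywise soft thresholding: shrink magnitudes by t, zeroing small entries."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def iterative_thresholding_pca(S, Q0, threshold, n_iter=20):
    """Illustrative sketch (hypothetical helper, not the paper's exact method).

    S         : (p, p) sample covariance matrix
    Q0        : (p, k) initial orthonormal basis, e.g. from ordinary PCA
    threshold : assumed fixed thresholding level applied to every entry
    n_iter    : assumed fixed number of iterations
    """
    Q = Q0
    for _ in range(n_iter):
        T = S @ Q                          # multiplication step (power iteration)
        T = soft_threshold(T, threshold)   # thresholding step promotes sparsity
        Q, _ = np.linalg.qr(T)             # re-orthonormalize the columns
    return Q
```

In this sketch the thresholding step is what distinguishes the iteration from ordinary orthogonal iteration: rows whose entries stay below the threshold are set to zero, so the estimated basis concentrates on a small set of features. How the threshold and the initial basis are chosen matters in practice and is left open here.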

Cite

@article{arxiv.1112.2432,
  title  = {Sparse principal component analysis and iterative thresholding},
  author = {Zongming Ma},
  journal= {arXiv preprint arXiv:1112.2432},
  year   = {2013}
}

Comments

Published at http://dx.doi.org/10.1214/13-AOS1097 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
