High-dimensional analysis of semidefinite relaxations for sparse principal components

Statistics Theory 2009-08-26 v2 Information Theory math.IT

Abstract

Principal component analysis (PCA) is a classical method for dimensionality reduction based on extracting the dominant eigenvectors of the sample covariance matrix. However, PCA is well known to behave poorly in the "large $p$, small $n$" setting, in which the problem dimension $p$ is comparable to or larger than the sample size $n$. This paper studies PCA in this high-dimensional regime, but under the additional assumption that the maximal eigenvector is sparse, say, with at most $k$ nonzero components. We consider a spiked covariance model in which a base matrix is perturbed by adding a $k$-sparse maximal eigenvector, and we analyze two computationally tractable methods for recovering the support set of this maximal eigenvector, as follows: (a) a simple diagonal thresholding method, which transitions from success to failure as a function of the rescaled sample size $\theta_{\mathrm{dia}}(n,p,k)=n/[k^2\log(p-k)]$; and (b) a more sophisticated semidefinite programming (SDP) relaxation, which succeeds once the rescaled sample size $\theta_{\mathrm{sdp}}(n,p,k)=n/[k\log(p-k)]$ is larger than a critical threshold. In addition, we prove that no method, including the best method which has exponential-time complexity, can succeed in recovering the support if the order parameter $\theta_{\mathrm{sdp}}(n,p,k)$ is below a threshold. Our results thus highlight an interesting trade-off between computational and statistical efficiency in high-dimensional inference.
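As a rough illustration of method (a), the following Python sketch draws data from a simplified spiked covariance model and estimates the support by keeping the $k$ coordinates with the largest sample variances. Several choices here are assumptions made for illustration and are not taken from the abstract: the base matrix is the identity, the spike has entries $\pm 1/\sqrt{k}$ on its support, the signal strength is called `beta`, and all function names are hypothetical. The SDP relaxation of method (b) is not implemented.

```python
import numpy as np

def sample_spiked(n, p, k, beta, rng):
    """Draw n samples from N(0, I_p + beta * z z^T) for a k-sparse unit vector z.

    Assumption: identity base matrix and spike entries of +/- 1/sqrt(k).
    """
    support = rng.choice(p, size=k, replace=False)
    z = np.zeros(p)
    z[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)
    # x_i = sqrt(beta) * g_i * z + w_i with g_i ~ N(0, 1) and w_i ~ N(0, I_p)
    g = rng.standard_normal(n)
    X = np.sqrt(beta) * np.outer(g, z) + rng.standard_normal((n, p))
    return X, np.sort(support)

def diagonal_thresholding(X, k):
    """Return the k coordinates with the largest sample variances, sorted."""
    # Diagonal of the sample covariance X^T X / n (the model is mean-zero).
    variances = np.mean(X ** 2, axis=0)
    return np.sort(np.argsort(variances)[-k:])

def theta_dia(n, p, k):
    """Rescaled sample size for diagonal thresholding: n / [k^2 log(p - k)]."""
    return n / (k ** 2 * np.log(p - k))

def theta_sdp(n, p, k):
    """Rescaled sample size for the SDP relaxation: n / [k log(p - k)]."""
    return n / (k * np.log(p - k))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p, k, beta = 20000, 50, 5, 3.0
    X, true_support = sample_spiked(n, p, k, beta, rng)
    estimate = diagonal_thresholding(X, k)
    print("theta_dia:", theta_dia(n, p, k), "theta_sdp:", theta_sdp(n, p, k))
    print("support recovered:", np.array_equal(estimate, true_support))
```

Because the two rescaled sample sizes differ by a factor of $k$, a fixed $n$ gives a `theta_sdp` that is $k$ times larger than `theta_dia`, which is one way to read the gap between the two methods described above.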

Cite

@article{arxiv.0803.4026,
  title  = {High-dimensional analysis of semidefinite relaxations for sparse principal components},
  author = {Arash A. Amini and Martin J. Wainwright},
  journal= {arXiv preprint arXiv:0803.4026},
  year   = {2009}
}

Comments

Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/08-AOS664
