Subexponential-Time Algorithms for Sparse PCA

Statistics Theory | 2022-06-24 | v3 | Computational Complexity, Data Structures and Algorithms, Machine Learning, Statistics Theory

Abstract

We study the computational cost of recovering a unit-norm sparse principal component $x \in \mathbb{R}^n$ planted in a random matrix, in either the Wigner or Wishart spiked model (observing either $W + \lambda xx^\top$ with $W$ drawn from the Gaussian orthogonal ensemble, or $N$ independent samples from $\mathcal{N}(0, I_n + \beta xx^\top)$, respectively). Prior work has shown that when the signal-to-noise ratio ($\lambda$ or $\beta\sqrt{N/n}$, respectively) is a small constant and the fraction of nonzero entries in the planted vector is $\|x\|_0 / n = \rho$, it is possible to recover $x$ in polynomial time if $\rho \lesssim 1/\sqrt{n}$. While it is possible to recover $x$ in exponential time under the weaker condition $\rho \ll 1$, it is believed that polynomial-time recovery is impossible unless $\rho \lesssim 1/\sqrt{n}$. We investigate the precise amount of time required for recovery in the "possible but hard" regime $1/\sqrt{n} \ll \rho \ll 1$ by exploring the power of subexponential-time algorithms, i.e., algorithms running in time $\exp(n^\delta)$ for some constant $\delta \in (0,1)$. For any $1/\sqrt{n} \ll \rho \ll 1$, we give a recovery algorithm with runtime roughly $\exp(\rho^2 n)$, demonstrating a smooth tradeoff between sparsity and runtime. Our family of algorithms interpolates smoothly between two existing algorithms: the polynomial-time diagonal thresholding algorithm and the $\exp(\rho n)$-time exhaustive search algorithm. Furthermore, by analyzing the low-degree likelihood ratio, we give rigorous evidence suggesting that the tradeoff achieved by our algorithms is optimal.
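To make the polynomial-time end of this tradeoff concrete, the following is a minimal sketch of diagonal thresholding in the Wishart spiked model. It is an illustration under stated assumptions, not the paper's algorithm or analysis: the function names `sample_spiked_wishart` and `diagonal_thresholding` are hypothetical, the planted vector is assumed to have entries $\pm 1/\sqrt{k}$ on a support of known size $k$, and the parameters are chosen so that the signal is very strong (far from the small-constant signal-to-noise regime discussed above) so the example recovers the support reliably.

```python
import numpy as np


def sample_spiked_wishart(n, N, k, beta, rng):
    """Draw N samples from N(0, I_n + beta * x x^T) with a k-sparse unit vector x.

    Assumption for illustration: x has entries +-1/sqrt(k) on a random support.
    """
    support = rng.choice(n, size=k, replace=False)
    x = np.zeros(n)
    x[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)
    g = rng.standard_normal(N)           # one latent Gaussian per sample
    Z = rng.standard_normal((N, n))      # isotropic noise
    # Each row y = z + sqrt(beta) * g * x has covariance I_n + beta * x x^T.
    Y = Z + np.sqrt(beta) * np.outer(g, x)
    return Y, x


def diagonal_thresholding(Y, k):
    """Estimate x from the k coordinates with the largest sample variance.

    Assumption for illustration: the sparsity k is known to the algorithm.
    """
    N, n = Y.shape
    S = Y.T @ Y / N                                   # sample covariance
    idx = np.sort(np.argsort(np.diag(S))[-k:])        # top-k diagonal entries
    sub = S[np.ix_(idx, idx)]                         # restrict to that support
    _, eigvecs = np.linalg.eigh(sub)                  # eigenvalues ascending
    x_hat = np.zeros(n)
    x_hat[idx] = eigvecs[:, -1]                       # leading eigenvector
    return idx, x_hat


rng = np.random.default_rng(0)
Y, x = sample_spiked_wishart(n=200, N=2000, k=5, beta=5.0, rng=rng)
idx, x_hat = diagonal_thresholding(Y, k=5)
overlap = abs(x_hat @ x)   # |<x_hat, x>|; the sign of x is not identifiable
print(sorted(idx.tolist()), overlap)
```

Here $\rho = k/n = 0.025$ lies below $1/\sqrt{n} \approx 0.07$, the regime where diagonal thresholding is expected to work. Exhaustive search would instead examine all $\binom{n}{k}$ candidate supports, which is the source of the $\exp(\rho n)$-type runtime at the other end of the tradeoff.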

Cite

@article{arxiv.1907.11635,
  title  = {Subexponential-Time Algorithms for Sparse PCA},
  author = {Yunzi Ding and Dmitriy Kunisky and Alexander S. Wein and Afonso S. Bandeira},
  journal= {arXiv preprint arXiv:1907.11635},
  year   = {2022}
}

Comments

44 pages
