List-Decodable Sparse Mean Estimation

Machine Learning 2022-12-07 v2

Abstract

Robust mean estimation is one of the most important problems in statistics: given a set of samples in $\mathbb{R}^d$ where an $\alpha$ fraction are drawn from some distribution $D$ and the rest are adversarially corrupted, we aim to estimate the mean of $D$. A surge of recent research interest has focused on the list-decodable setting, where $\alpha \in (0, \frac12]$ and the goal is to output a finite number of estimates among which at least one approximates the target mean. In this paper, we consider the case where the underlying distribution $D$ is Gaussian with a $k$-sparse mean. Our main contribution is the first polynomial-time algorithm that enjoys sample complexity $O\big(\mathrm{poly}(k, \log d)\big)$, i.e. poly-logarithmic in the dimension. One of our core algorithmic ingredients is using low-degree sparse polynomials to filter outliers, which may find more applications.
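
To make the problem setup concrete, the following is a minimal sketch of the list-decodable data model and its success criterion only. It is not the paper's algorithm. The function names, the toy adversary (outliers that mimic a Gaussian around a decoy sparse mean), the identity covariance, and all parameter values are assumptions chosen for illustration.

```python
import random


def make_sparse_mean(d, k, magnitude, rng):
    """Return a k-sparse vector in R^d: k random coordinates set to +/- magnitude."""
    mu = [0.0] * d
    for i in rng.sample(range(d), k):
        mu[i] = magnitude * rng.choice([-1.0, 1.0])
    return mu


def make_corrupted_sample(n, alpha, mu, decoy, rng):
    """Return n points: int(alpha * n) inliers from N(mu, I), the rest outliers.

    Toy adversary (an assumption): outliers are drawn from N(decoy, I), so they
    look like a second plausible cluster. A real adversary may be arbitrary.
    """
    n_inliers = int(alpha * n)
    inliers = [[rng.gauss(m, 1.0) for m in mu] for _ in range(n_inliers)]
    outliers = [[rng.gauss(m, 1.0) for m in decoy] for _ in range(n - n_inliers)]
    points = inliers + outliers
    rng.shuffle(points)
    return points


def l2_distance(u, v):
    """Euclidean distance between two vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def list_decoding_error(candidates, mu):
    """List-decoding criterion: distance from the best candidate to the true mean."""
    return min(l2_distance(c, mu) for c in candidates)


def empirical_mean(points):
    """Coordinate-wise average of the points (a non-robust baseline)."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


# Hypothetical parameters: d = 50, k = 4, alpha = 0.25, n = 200.
rng = random.Random(0)
mu = make_sparse_mean(50, 4, 5.0, rng)
decoy = make_sparse_mean(50, 4, 5.0, rng)
points = make_corrupted_sample(200, 0.25, mu, decoy, rng)

# With a minority of inliers, a single estimate such as the empirical mean is
# pulled toward the outliers; a list only needs one good candidate.
single_error = l2_distance(empirical_mean(points), mu)
list_error = list_decoding_error([decoy, mu], mu)
```

Because the inliers form a minority when $\alpha \le \frac12$, no single estimate can be guaranteed to be close to the true mean, which is why the criterion above scores a list by its best candidate.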

Cite

@article{arxiv.2205.14337,
  title  = {List-Decodable Sparse Mean Estimation},
  author = {Shiwei Zeng and Jie Shen},
  journal= {arXiv preprint arXiv:2205.14337},
  year   = {2022}
}

Comments

v2 introduces low-degree polynomials to improve the error rate; accepted to NeurIPS 2022.