Faster Algorithms for Testing under Conditional Sampling

Data Structures and Algorithms 2015-04-17 v1 Computational Complexity Machine Learning Statistics Theory

Abstract

There has been considerable recent interest in distribution tests whose run-time and sample requirements are sublinear in the domain size $k$. We study two of the most important tests under the conditional-sampling model, where each query specifies a subset $S$ of the domain, and the response is a sample drawn from $S$ according to the underlying distribution. For identity testing, which asks whether the underlying distribution equals a specific given distribution or $\epsilon$-differs from it, we reduce the known time and sample complexities from $\tilde{\mathcal{O}}(\epsilon^{-4})$ to $\tilde{\mathcal{O}}(\epsilon^{-2})$, thereby matching the information-theoretic lower bound. For closeness testing, which asks whether two distributions underlying observed data sets are equal or different, we reduce the existing complexity from $\tilde{\mathcal{O}}(\epsilon^{-4} \log^5 k)$ to an even sub-logarithmic $\tilde{\mathcal{O}}(\epsilon^{-5} \log \log k)$, thus providing a better bound to an open problem in Bertinoro Workshop on Sublinear Algorithms [Fisher, 2004].
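To make the query model concrete, the following is a minimal sketch of a conditional-sampling oracle, not the paper's algorithms. The class name `ConditionalSampler`, its `sample` method, the query counter, and the choice to reject subsets of zero probability are all assumptions made for illustration; the abstract only specifies that a query names a subset $S$ and the response is drawn from $S$ according to the underlying distribution.

```python
import random


class ConditionalSampler:
    """Hypothetical oracle for the conditional-sampling model (illustration only)."""

    def __init__(self, probs, seed=None):
        # probs[i] is the probability of domain element i; the domain is {0, ..., k-1}.
        if abs(sum(probs) - 1.0) > 1e-9:
            raise ValueError("probabilities must sum to 1")
        self.probs = list(probs)
        self.rng = random.Random(seed)
        self.queries = 0  # number of conditional queries answered so far

    def sample(self, subset):
        """Return one element of `subset`, drawn with probability p(i) / p(subset)."""
        elements = sorted(set(subset))
        weights = [self.probs[i] for i in elements]
        # Assumption: a subset with zero total probability is rejected here;
        # how such queries are answered is a modelling choice.
        if not elements or sum(weights) == 0:
            raise ValueError("subset must have positive probability")
        self.queries += 1
        return self.rng.choices(elements, weights=weights, k=1)[0]


if __name__ == "__main__":
    oracle = ConditionalSampler([0.5, 0.25, 0.25, 0.0], seed=0)
    # Conditioned on S = {1, 2}, elements 1 and 2 are equally likely.
    print([oracle.sample({1, 2}) for _ in range(10)])
```

A tester in this model would be charged one unit per call to `sample`, which is what the `queries` counter tracks; the bounds quoted above are stated in terms of such queries and running time.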

Cite

@article{arxiv.1504.04103,
  title  = {Faster Algorithms for Testing under Conditional Sampling},
  author = {Moein Falahatgar and Ashkan Jafarpour and Alon Orlitsky and Venkatadheeraj Pichapathi and Ananda Theertha Suresh},
  journal= {arXiv preprint arXiv:1504.04103},
  year   = {2015}
}

Comments

31 pages
