Sharp Bounds for Generalized Uniformity Testing

Data Structures and Algorithms 2017-09-08 v1 Information Theory Machine Learning math.IT Statistics Theory

Abstract

We study the problem of generalized uniformity testing \cite{BC17} of a discrete probability distribution: Given samples from a probability distribution $p$ over an {\em unknown} discrete domain $\mathbf{\Omega}$, we want to distinguish, with probability at least $2/3$, between the case that $p$ is uniform on some {\em subset} of $\mathbf{\Omega}$ versus $\epsilon$-far, in total variation distance, from any such uniform distribution. We establish tight bounds on the sample complexity of generalized uniformity testing. In more detail, we present a computationally efficient tester whose sample complexity is optimal, up to constant factors, and a matching information-theoretic lower bound. Specifically, we show that the sample complexity of generalized uniformity testing is $\Theta\left(1/(\epsilon^{4/3}\|p\|_3) + 1/(\epsilon^{2} \|p\|_2) \right)$.
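To make the stated bound concrete, the following minimal sketch evaluates the expression inside the $\Theta(\cdot)$ for an explicitly given distribution. It is only an illustration of the formula, not the tester from the paper: the function name `sample_complexity_scale`, the list representation of $p$, and the chosen values of `n` and `eps` are assumptions made here, and the hidden constant factors are ignored. For the uniform distribution on $n$ elements, $\|p\|_3 = n^{-2/3}$ and $\|p\|_2 = n^{-1/2}$, so the expression reduces to $n^{2/3}/\epsilon^{4/3} + n^{1/2}/\epsilon^{2}$.

```python
def sample_complexity_scale(p, eps):
    """Evaluate 1/(eps^(4/3) * ||p||_3) + 1/(eps^2 * ||p||_2).

    Hypothetical helper: `p` is a list of probabilities summing to 1 and
    `eps` is the distance parameter. Constant factors hidden by the
    Theta(.) notation are ignored, so only the scaling is meaningful.
    """
    if not 0 < eps <= 1:
        raise ValueError("eps must lie in (0, 1]")
    norm3 = sum(x ** 3 for x in p) ** (1 / 3)  # l_3 norm of p
    norm2 = sum(x ** 2 for x in p) ** 0.5      # l_2 norm of p
    return 1 / (eps ** (4 / 3) * norm3) + 1 / (eps ** 2 * norm2)


# Illustrative values (assumptions, not taken from the paper).
n, eps = 1000, 0.1
uniform = [1 / n] * n

scale = sample_complexity_scale(uniform, eps)
# Closed form for the uniform distribution on n elements.
closed_form = n ** (2 / 3) / eps ** (4 / 3) + n ** 0.5 / eps ** 2
```

In this example `scale` and `closed_form` agree up to floating-point error, which checks the reduction to the uniform case described above.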

Cite

@article{arxiv.1709.02087,
  title  = {Sharp Bounds for Generalized Uniformity Testing},
  author = {Ilias Diakonikolas and Daniel M. Kane and Alistair Stewart},
  journal= {arXiv preprint arXiv:1709.02087},
  year   = {2017}
}