Efficient Density Estimation via Piecewise Polynomial Approximation

Machine Learning 2013-05-15 v1 Data Structures and Algorithms Machine Learning

Abstract

We give a highly efficient "semi-agnostic" algorithm for learning univariate probability distributions that are well approximated by piecewise polynomial density functions. Let $p$ be an arbitrary distribution over an interval $I$ which is $\tau$-close (in total variation distance) to an unknown probability distribution $q$ that is defined by an unknown partition of $I$ into $t$ intervals and $t$ unknown degree-$d$ polynomials specifying $q$ over each of the intervals. We give an algorithm that draws $\tilde{O}(t(d+1)/\epsilon^2)$ samples from $p$, runs in time $\mathrm{poly}(t, d, 1/\epsilon)$, and with high probability outputs a piecewise polynomial hypothesis distribution $h$ that is $(O(\tau)+\epsilon)$-close (in total variation distance) to $p$. This sample complexity is essentially optimal; we show that even for $\tau = 0$, any algorithm that learns an unknown $t$-piecewise degree-$d$ probability distribution over $I$ to accuracy $\epsilon$ must use $\Omega\left(\frac{t(d+1)}{\mathrm{poly}(1+\log(d+1))} \cdot \frac{1}{\epsilon^2}\right)$ samples from the distribution, regardless of its running time. Our algorithm combines tools from approximation theory, uniform convergence, linear programming, and dynamic programming. We apply this general algorithm to obtain a wide range of results for many natural problems in density estimation over both continuous and discrete domains. These include state-of-the-art results for learning mixtures of log-concave distributions; mixtures of $t$-modal distributions; mixtures of Monotone Hazard Rate distributions; mixtures of Poisson Binomial Distributions; mixtures of Gaussians; and mixtures of $k$-monotone densities. Our general technique yields computationally efficient algorithms for all these problems, in many cases with provably optimal sample complexities (up to logarithmic factors) in all parameters.
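
To make the objects in the abstract concrete, the following Python sketch represents a $t$-piecewise degree-$d$ density and numerically approximates the total variation distance between two densities. It is an illustration only and is not the learning algorithm from the paper: the names `eval_piecewise_poly`, `total_variation`, and `sample_bound`, the coefficient layout, and the midpoint-rule integration are all assumptions made for this example. In particular, `sample_bound` drops the constants and logarithmic factors hidden in $\tilde{O}$.

```python
import math
from bisect import bisect_right


def eval_piecewise_poly(breakpoints, coeffs, x):
    """Evaluate a piecewise polynomial at x.

    breakpoints: sorted list [a_0, ..., a_t] splitting [a_0, a_t] into t intervals.
    coeffs: list of t coefficient lists, lowest degree first (hypothetical layout).
    Returns 0.0 outside [a_0, a_t].
    """
    if x < breakpoints[0] or x > breakpoints[-1]:
        return 0.0
    # Index of the interval containing x; the right endpoint a_t is
    # assigned to the last interval.
    i = min(bisect_right(breakpoints, x) - 1, len(coeffs) - 1)
    # Horner's rule, starting from the highest-degree coefficient.
    result = 0.0
    for c in reversed(coeffs[i]):
        result = result * x + c
    return result


def total_variation(f, g, lo, hi, n=10_000):
    """Approximate (1/2) * integral of |f - g| over [lo, hi] by the midpoint rule."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += abs(f(x) - g(x))
    return 0.5 * total * h


def sample_bound(t, d, eps):
    """t(d+1)/eps^2 with constants and log factors omitted (illustrative only)."""
    return math.ceil(t * (d + 1) / eps ** 2)


# Example: a 2-piecewise degree-1 "triangle" density on [0, 1],
# q(x) = 4x on [0, 0.5] and q(x) = 4 - 4x on [0.5, 1].
BREAKS = [0.0, 0.5, 1.0]
COEFFS = [[0.0, 4.0], [4.0, -4.0]]


def triangle(x):
    return eval_piecewise_poly(BREAKS, COEFFS, x)


def uniform(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


if __name__ == "__main__":
    print(total_variation(triangle, uniform, 0.0, 1.0))
    print(sample_bound(t=2, d=1, eps=0.5))
```

Here the triangle density is exactly a $t = 2$, $d = 1$ instance, and its total variation distance to the uniform distribution on $[0, 1]$ comes out close to $0.25$. A hypothesis $h$ in the abstract's guarantee would be another piecewise polynomial of this form whose distance to $p$ is at most $O(\tau) + \epsilon$.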

Cite

@article{arxiv.1305.3207,
  title  = {Efficient Density Estimation via Piecewise Polynomial Approximation},
  author = {Siu-On Chan and Ilias Diakonikolas and Rocco A. Servedio and Xiaorui Sun},
  journal= {arXiv preprint arXiv:1305.3207},
  year   = {2013}
}