TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm
Abstract
Approximating distributions from their samples is a canonical statistical-learning problem. One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial, where $t \ge 1$ and $d \ge 0$. Letting $c_{t,d}$ denote the smallest such factor, clearly $c_{1,0} = 1$, and it can be shown that $c_{t,d} \ge 2$ for all other $t$ and $d$. Yet current computationally efficient algorithms show only $c_{t,1} \le 2.25$, and the bound rises quickly to $c_{t,d} \le 3$ for $d \ge 9$. We derive a near-linear-time and essentially sample-optimal estimator that establishes $c_{t,d} = 2$ for all $(t,d) \ne (1,0)$. Additionally, for many practical distributions, the lowest approximation distance is achieved by polynomials with a widely varying number of pieces. We provide a method that estimates this number near-optimally and hence helps approach the best possible approximation. Experiments combining the two techniques confirm improved performance over existing methodologies.
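To make the benchmark quantity concrete, the following is a minimal sketch of the $\ell_1$ distance between a density and its best piecewise-constant (degree-0) approximation for a fixed set of breakpoints. It is an illustration only and is not the estimator described in the abstract: the function name `l1_to_piecewise_constant`, the uniform grid, and the fixed breakpoints are all assumptions made for the example, and the true "closest $t$-piece polynomial" would also optimize over breakpoint locations and allow higher degrees. On each piece, the constant minimizing the summed absolute deviation is the median of the density values.

```python
import numpy as np

def l1_to_piecewise_constant(density, grid, breakpoints):
    """Approximate l1 distance from `density` (sampled on a uniform `grid`)
    to the best piecewise-constant function with the given breakpoints.

    Hypothetical helper for illustration; breakpoints are held fixed.
    """
    dx = grid[1] - grid[0]                      # uniform grid spacing
    edges = np.searchsorted(grid, breakpoints)  # indices where pieces start
    total = 0.0
    for piece in np.split(density, edges):
        if piece.size:
            # The median minimizes the sum of absolute deviations on a piece.
            total += np.abs(piece - np.median(piece)).sum() * dx
    return total

# A step density on [0, 1): height 1.5 on the left half, 0.5 on the right.
grid = np.arange(1000) / 1000
density = np.where(grid < 0.5, 1.5, 0.5)

one_piece = l1_to_piecewise_constant(density, grid, [])      # t = 1
two_piece = l1_to_piecewise_constant(density, grid, [0.5])   # t = 2
```

With a single piece the step density cannot be matched, so the distance is strictly positive; with a breakpoint at the step, two pieces represent it exactly. This is the sense in which the number of pieces governs the best achievable approximation distance.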
Cite
@article{arxiv.2202.07172,
title = {TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm},
author = {Yi Hao and Ayush Jain and Alon Orlitsky and Vaishakh Ravindrakumar},
journal = {arXiv preprint arXiv:2202.07172},
year = {2022}
}
Comments
19 pages, 12 figures