Related papers: TURF: A Two-factor, Universal, Robust, Fast Distri…

200 papers

Sample- and computationally-efficient distribution estimation is a fundamental tenet in statistics and machine learning. We present SURF, an algorithm for approximating distributions by piecewise polynomials. SURF is: simple, replacing…

Machine Learning · Statistics 2021-02-15 Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar

We design a new, fast algorithm for agnostically learning univariate probability distributions whose densities are well approximated by piecewise polynomial functions. Let $f$ be the density function of an arbitrary univariate distribution,…

Data Structures and Algorithms · Computer Science 2015-06-03 Jayadev Acharya, Ilias Diakonikolas, Jerry Li, Ludwig Schmidt

We study the {\em robust proper learning} of univariate log-concave distributions (over continuous and discrete domains). Given a set of samples drawn from an unknown target distribution, we want to compute a log-concave hypothesis…

Data Structures and Algorithms · Computer Science 2016-06-10 Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart

We resolve a long-standing open question, about the existence of a constant-factor approximation algorithm for the average-case \textsc{Decision Tree} problem with uniform probability distribution over the hypotheses. We answer the question…

Data Structures and Algorithms · Computer Science 2026-04-29 Michał Szyfelbein

We give a highly efficient "semi-agnostic" algorithm for learning univariate probability distributions that are well approximated by piecewise polynomial density functions. Let $p$ be an arbitrary distribution over an interval $I$ which is…

Machine Learning · Computer Science 2013-05-15 Siu-On Chan, Ilias Diakonikolas, Rocco A. Servedio, Xiaorui Sun

We give a deterministic algorithm for approximately counting satisfying assignments of a degree-$d$ polynomial threshold function (PTF). Given a degree-$d$ input polynomial $p(x_1,\dots,x_n)$ over $R^n$ and a parameter $\epsilon> 0$, our…

Computational Complexity · Computer Science 2013-12-02 Anindya De, Rocco Servedio

A $k$-modal probability distribution over the discrete domain $\{1,...,n\}$ is one whose histogram has at most $k$ "peaks" and "valleys." Such distributions are natural generalizations of monotone ($k=0$) and unimodal ($k=1$) probability…

Data Structures and Algorithms · Computer Science 2014-09-16 Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio

Finding a maximum cut is a fundamental task in many computational settings. Surprisingly, it has been insufficiently studied in the classic distributed settings, where vertices communicate by synchronously sending messages to their…

Data Structures and Algorithms · Computer Science 2017-07-27 Keren Censor-Hillel, Rina Levy, Hadas Shachnai

Analyzing high-dimensional data with manifold learning algorithms often requires searching for the nearest neighbors of all observations. This presents a computational bottleneck in statistical manifold learning when observations of…

Machine Learning · Computer Science 2022-03-11 Fan Cheng, Anastasios Panagiotelis, Rob J Hyndman

Let $p$ be an unknown and arbitrary probability distribution over $[0,1)$. We consider the problem of {\em density estimation}, in which a learning algorithm is given i.i.d. draws from $p$ and must (with high probability) output a…

Machine Learning · Computer Science 2014-11-04 Siu-On Chan, Ilias Diakonikolas, Rocco A. Servedio, Xiaorui Sun

We give a general unified method that can be used for $L_1$ {\em closeness testing} of a wide range of univariate structured distribution families. More specifically, we design a sample optimal and computationally efficient algorithm for…

Data Structures and Algorithms · Computer Science 2015-08-25 Ilias Diakonikolas, Daniel M. Kane, Vladimir Nikishkin

We design efficient distance approximation algorithms for several classes of structured high-dimensional distributions. Specifically, we show algorithms for the following problems: - Given sample access to two Bayesian networks $P_1$ and…

Data Structures and Algorithms · Computer Science 2020-02-17 Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S. Meel, N. V. Vinodchandran

In the hypothesis selection problem, we are given sample and query access to a finite set of candidate distributions (hypotheses), $\mathcal{H} = \{H_1, \ldots, H_n\}$, and samples from an unknown distribution $P$, both over a domain…

Data Structures and Algorithms · Computer Science 2025-11-12 Anders Aamand, Maryam Aliakbarpour, Justin Y. Chen, Sandeep Silwal

The seminar assignment problem is a variant of the generalized assignment problem in which items have unit size and the amount of space allowed in each bin is restricted to an arbitrary set of values. The problem has been shown to be…

Data Structures and Algorithms · Computer Science 2016-10-18 Amotz Bar-Noy, George Rabanca

Consider the following problem: given two arbitrary densities $q_1,q_2$ and a sample-access to an unknown target density $p$, find which of the $q_i$'s is closer to $p$ in total variation. A remarkable result due to Yatracos shows that this…

Machine Learning · Computer Science 2025-12-16 Olivier Bousquet, Daniel Kane, Shay Moran

A fundamental notion of distance between train and test distributions from the field of domain adaptation is discrepancy distance. While in general hard to compute, here we provide the first set of provably efficient algorithms for testing…

Data Structures and Algorithms · Computer Science 2024-06-14 Gautam Chandrasekaran, Adam R. Klivans, Vasilis Kontonis, Konstantinos Stavropoulos, Arsen Vasilyan

We study the problem of robustly learning multi-dimensional histograms. A $d$-dimensional function $h: D \rightarrow \mathbb{R}$ is called a $k$-histogram if there exists a partition of the domain $D \subseteq \mathbb{R}^d$ into $k$…

Machine Learning · Computer Science 2018-02-26 Ilias Diakonikolas, Jerry Li, Ludwig Schmidt

Diffusion is a fundamental graph process, underpinning such phenomena as epidemic disease contagion and the spread of innovation by word-of-mouth. We address the algorithmic problem of finding a set of k initial seed nodes in a network so…

Data Structures and Algorithms · Computer Science 2016-06-23 Christian Borgs, Michael Brautbar, Jennifer Chayes, Brendan Lucier

We propose a simple subsampling scheme for fast randomized approximate computation of optimal transport distances. This scheme operates on a random subset of the full data and can use any exact algorithm as a black-box back-end, including…

Computation · Statistics 2020-12-17 Max Sommerfeld, Jörn Schrieber, Yoav Zemel, Axel Munk

Many popular learning algorithms (e.g., regression, Fourier-transform-based algorithms, kernel SVM, and kernel ridge regression) operate by reducing the problem to a convex optimization problem over a vector space of functions. These methods…

Machine Learning · Computer Science 2014-05-13 Amit Daniely, Nati Linial, Shai Shalev-Shwartz