Related papers: Faster Algorithms for Testing under Conditional Sa…

200 papers

Recently, there has been significant work studying distribution testing under the Conditional Sampling model. In this model, a query specifies a subset $S$ of the domain, and the output received is a sample drawn from the distribution…

Data Structures and Algorithms · Computer Science · 2020-11-05 · Shyam Narayanan

A conditional sampling oracle for a probability distribution $D$ returns samples from the conditional distribution of $D$ restricted to a specified subset of the domain. A recent line of work (Chakraborty et al. 2013 and Canonne et al. 2014)…

Data Structures and Algorithms · Computer Science · 2016-08-18 · Themistoklis Gouleakis, Christos Tzamos, Manolis Zampetakis

In this paper, we consider the problem of testing properties of joint distributions under the Conditional Sampling framework. In the standard sampling model, the sample complexity of testing properties of joint distributions is exponential…

Computational Complexity · Computer Science · 2022-08-03 · Rishiraj Bhattacharyya, Sourav Chakraborty

A recent model for property testing of probability distributions (Chakraborty et al., ITCS 2013, Canonne et al., SICOMP 2015) enables tremendous savings in the sample complexity of testing algorithms, by allowing them to condition the…

Data Structures and Algorithms · Computer Science · 2018-12-10 · Jayadev Acharya, Clément L. Canonne, Gautam Kamath

We examine the extent to which sublinear-sample property testing and estimation apply to settings where samples are independently but not identically distributed. Specifically, we consider the following distributional property testing…

Data Structures and Algorithms · Computer Science · 2025-11-05 · Shivam Garg, Chirag Pabbaraju, Kirankumar Shiragur, Gregory Valiant

We investigate distribution testing with access to non-adaptive conditional samples. In the conditional sampling model, the algorithm is given the following access to a distribution: it submits a query set $S$ to an oracle, which returns a…

Data Structures and Algorithms · Computer Science · 2018-11-06 · Gautam Kamath, Christos Tzamos

We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution $p$, extensive research has established optimal bounds for uniformity testing,…

Machine Learning · Computer Science · 2024-12-03 · Maryam Aliakbarpour, Piotr Indyk, Ronitt Rubinfeld, Sandeep Silwal

Equivalence testing, a fundamental problem in the field of distribution testing, seeks to infer if two unknown distributions on $[n]$ are the same or far apart in the total variation distance. Conditional sampling has emerged as a powerful…

Data Structures and Algorithms · Computer Science · 2024-03-08 · Diptarka Chakraborty, Sourav Chakraborty, Gunjan Kumar, Kuldeep S. Meel

We study the density estimation problem defined as follows: given $k$ distributions $p_1, \ldots, p_k$ over a discrete domain $[n]$, as well as a collection of samples chosen from a ``query'' distribution $q$ over $[n]$, output $p_i$ that…

Data Structures and Algorithms · Computer Science · 2024-10-31 · Anders Aamand, Alexandr Andoni, Justin Y. Chen, Piotr Indyk, Shyam Narayanan, Sandeep Silwal, Haike Xu

We study the fundamental problems of identity testing (goodness of fit), and closeness testing (two sample test) of distributions over $k$ elements, under differential privacy. While the problems have a long history in statistics, finite…

Machine Learning · Computer Science · 2017-11-01 · Jayadev Acharya, Ziteng Sun, Huanyu Zhang

We initiate a systematic investigation of distribution testing in the framework of algorithmic replicability. Specifically, given independent samples from a collection of probability distributions, the goal is to characterize the sample…

Machine Learning · Computer Science · 2025-07-04 · Ilias Diakonikolas, Jingyi Gao, Daniel Kane, Sihan Liu, Christopher Ye

We investigate the problem of testing whether a discrete probability distribution over an ordered domain is a histogram on a specified number of bins. One of the most common tools for the succinct approximation of data, $k$-histograms over…

Data Structures and Algorithms · Computer Science · 2022-07-15 · Clément L. Canonne, Ilias Diakonikolas, Daniel M. Kane, Sihan Liu

We study the problem of testing \emph{conditional independence} for discrete distributions. Specifically, given samples from a discrete random variable $(X, Y, Z)$ on domain $[\ell_1]\times[\ell_2] \times [n]$, we want to distinguish, with…

Data Structures and Algorithms · Computer Science · 2018-07-03 · Clément L. Canonne, Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart

We study the following distribution clustering problem: Given a hidden partition of $k$ distributions into two groups, such that the distributions within each group are the same, and the two distributions associated with the two clusters…

Data Structures and Algorithms · Computer Science · 2025-12-10 · Gunjan Kumar, Yash Pote, Jonathan Scarlett

We study a new framework for property testing of probability distributions, by considering distribution testing algorithms that have access to a conditional sampling oracle. This is an oracle that takes as input a subset $S \subseteq [N]$…

Data Structures and Algorithms · Computer Science · 2015-01-19 · Clement Canonne, Dana Ron, Rocco A. Servedio

We study the identity testing problem for high-dimensional distributions. Given as input an explicit distribution $\mu$, an $\varepsilon>0$, and access to sampling oracle(s) for a hidden distribution $\pi$, the goal in identity testing is…

Data Structures and Algorithms · Computer Science · 2024-09-02 · Antonio Blanca, Zongchen Chen, Daniel Štefankovič, Eric Vigoda

We consider the problem of testing distribution identity. Given a sequence of independent samples from an unknown distribution on a domain of size $n$, the goal is to check if the unknown distribution approximately equals a known distribution…

Data Structures and Algorithms · Computer Science · 2009-10-20 · Krzysztof Onak

We give highly efficient algorithms, and almost matching lower bounds, for a range of basic statistical problems that involve testing and estimating the $L_1$ distance between two $k$-modal distributions $p$ and $q$ over the discrete domain…

Data Structures and Algorithms · Computer Science · 2011-12-26 · Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio, Gregory Valiant, Paul Valiant

Given samples from two distributions over an $n$-element set, we wish to test whether these distributions are statistically close. We present an algorithm which uses sublinear in $n$, specifically, $O(n^{2/3}\epsilon^{-8/3}\log n)$,…

Data Structures and Algorithms · Computer Science · 2010-11-05 · Tugkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, Patrick White

We investigate the problem of identity testing for multidimensional histogram distributions. A distribution $p: D \rightarrow \mathbb{R}_+$, where $D \subseteq \mathbb{R}^d$, is called a $k$-histogram if there exists a partition of the…

Data Structures and Algorithms · Computer Science · 2019-02-20 · Ilias Diakonikolas, Daniel M. Kane, John Peebles