Related papers: Anaconda: A Non-Adaptive Conditional Sampling Algo…

Equivalence Testing: The Power of Bounded Adaptivity

Equivalence testing, a fundamental problem in the field of distribution testing, seeks to infer if two unknown distributions on $[n]$ are the same or far apart in the total variation distance. Conditional sampling has emerged as a powerful…

Data Structures and Algorithms · Computer Science 2024-03-08 Diptarka Chakraborty, Sourav Chakraborty, Gunjan Kumar, Kuldeep S. Meel

Testing probability distributions using conditional samples

We study a new framework for property testing of probability distributions, by considering distribution testing algorithms that have access to a conditional sampling oracle. This is an oracle that takes as input a subset $S \subseteq [N]$…

Data Structures and Algorithms · Computer Science 2015-01-19 Clément Canonne, Dana Ron, Rocco A. Servedio

A Chasm Between Identity and Equivalence Testing with Conditional Queries

A recent model for property testing of probability distributions (Chakraborty et al., ITCS 2013, Canonne et al., SICOMP 2015) enables tremendous savings in the sample complexity of testing algorithms, by allowing them to condition the…

Data Structures and Algorithms · Computer Science 2018-12-10 Jayadev Acharya, Clément L. Canonne, Gautam Kamath

On Distribution Testing in the Conditional Sampling Model

Recently, there has been significant work studying distribution testing under the Conditional Sampling model. In this model, a query specifies a subset $S$ of the domain, and the output received is a sample drawn from the distribution…

Data Structures and Algorithms · Computer Science 2020-11-05 Shyam Narayanan

Optimal Algorithms for Augmented Testing of Discrete Distributions

We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution $p$, extensive research has established optimal bounds for uniformity testing,…

Machine Learning · Computer Science 2024-12-03 Maryam Aliakbarpour, Piotr Indyk, Ronitt Rubinfeld, Sandeep Silwal

Property Testing of Joint Distributions using Conditional Samples

In this paper, we consider the problem of testing properties of joint distributions under the Conditional Sampling framework. In the standard sampling model, the sample complexity of testing properties of joint distributions is exponential…

Computational Complexity · Computer Science 2022-08-03 Rishiraj Bhattacharyya, Sourav Chakraborty

Tight Lower Bound on Equivalence Testing in Conditional Sampling Model

We study the equivalence testing problem where the goal is to determine if the given two unknown distributions on $[n]$ are equal or $\epsilon$-far in the total variation distance in the conditional sampling model (CFGM, SICOMP16; CRS,…

Data Structures and Algorithms · Computer Science 2023-08-23 Diptarka Chakraborty, Sourav Chakraborty, Gunjan Kumar

Faster Algorithms for Testing under Conditional Sampling

There has been considerable recent interest in distribution-tests whose run-time and sample requirements are sublinear in the domain-size $k$. We study two of the most important tests under the conditional-sampling model where each query…

Data Structures and Algorithms · Computer Science 2015-04-17 Moein Falahatgar, Ashkan Jafarpour, Alon Orlitsky, Venkatadheeraj Pichapathi, Ananda Theertha Suresh

Improving and extending the testing of distributions for shape-restricted properties

Distribution testing deals with what information can be deduced about an unknown distribution over $\{1,\ldots,n\}$, where the algorithm is only allowed to obtain a relatively small number of independent samples from the distribution. In…

Computational Complexity · Computer Science 2016-09-23 Eldar Fischer, Oded Lachish, Yadu Vasudev

Testing probability distributions underlying aggregated data

In this paper, we analyze and study a hybrid model for testing and learning probability distributions. Here, in addition to samples, the testing algorithm is provided with one of two different types of oracles to the unknown distribution…

Data Structures and Algorithms · Computer Science 2014-02-18 Clément Canonne, Ronitt Rubinfeld

Faster Sublinear Algorithms using Conditional Sampling

A conditional sampling oracle for a probability distribution D returns samples from the conditional distribution of D restricted to a specified subset of the domain. A recent line of work (Chakraborty et al. 2013 and Canonne et al. 2014)…

Data Structures and Algorithms · Computer Science 2016-08-18 Themistoklis Gouleakis, Christos Tzamos, Manolis Zampetakis

The Sample Complexity of Search over Multiple Populations

This paper studies the sample complexity of searching over multiple populations. We consider a large number of populations, each corresponding to either distribution P0 or P1. The goal of the search problem studied here is to find one…

Information Theory · Computer Science 2016-11-17 Matthew L. Malloy, Gongguo Tang, Robert D. Nowak

A New Approach for Testing Properties of Discrete Distributions

In this work, we give a novel general approach for distribution testing. We describe two techniques: our first technique gives sample-optimal testers, while our second technique gives matching sample lower bounds. As a consequence, we…

Data Structures and Algorithms · Computer Science 2016-05-10 Ilias Diakonikolas, Daniel M. Kane

Algorithmic Stability for Adaptive Data Analysis

Adaptivity is an important feature of data analysis---the choice of questions to ask about a dataset often depends on previous interactions with the same dataset. However, statistical validity is typically studied in a nonadaptive model,…

Machine Learning · Computer Science 2015-11-10 Raef Bassily, Kobbi Nissim, Adam Smith, Thomas Steinke, Uri Stemmer, Jonathan Ullman

Distribution Testing with a Confused Collector

We are interested in testing properties of distributions with systematically mislabeled samples. Our goal is to make decisions about unknown probability distributions, using a sample that has been collected by a confused collector, such as…

Data Structures and Algorithms · Computer Science 2023-11-27 Renato Ferreira Pinto, Nathaniel Harms

Asymptotically Optimal Matching of Multiple Sequences to Source Distributions and Training Sequences

Consider a finite set of sources, each producing i.i.d. observations that follow a unique probability distribution on a finite alphabet. We study the problem of matching a finite set of observed sequences to the set of sources under the…

Information Theory · Computer Science 2014-12-09 Jayakrishnan Unnikrishnan

Testing properties of distributions in the streaming model

We study distribution testing in the standard access model and the conditional access model when the memory available to the testing algorithm is bounded. In both scenarios, the samples appear in an online fashion and the goal is to test…

Data Structures and Algorithms · Computer Science 2023-09-08 Sampriti Roy, Yadu Vasudev

Replicable Distribution Testing

We initiate a systematic investigation of distribution testing in the framework of algorithmic replicability. Specifically, given independent samples from a collection of probability distributions, the goal is to characterize the sample…

Machine Learning · Computer Science 2025-07-04 Ilias Diakonikolas, Jingyi Gao, Daniel Kane, Sihan Liu, Christopher Ye

Testing with Non-identically Distributed Samples

We examine the extent to which sublinear-sample property testing and estimation apply to settings where samples are independently but not identically distributed. Specifically, we consider the following distributional property testing…

Data Structures and Algorithms · Computer Science 2025-11-05 Shivam Garg, Chirag Pabbaraju, Kirankumar Shiragur, Gregory Valiant

Near-Optimal Parallel Approximate Counting via Sampling

The computational equivalence between approximate counting and sampling is well established for polynomial-time algorithms. The most efficient general reduction from counting to sampling is achieved via simulated annealing, where the…

Data Structures and Algorithms · Computer Science 2026-04-03 David G. Harris, Vladimir Kolmogorov, Hongyang Liu, Yitong Yin, Yiyao Zhang