Related papers: Testing Distribution Identity Efficiently
We study the problem of testing identity against a given distribution with a focus on the high confidence regime. More precisely, given samples from an unknown distribution $p$ over $n$ elements, an explicitly given distribution $q$, and…
There has been considerable recent interest in distribution-tests whose run-time and sample requirements are sublinear in the domain-size $k$. We study two of the most important tests under the conditional-sampling model where each query…
One of the most fundamental problems in distribution testing is the identity testing problem: given samples $x_1,\ldots,x_s$, the goal is to determine whether the samples are drawn from a target distribution $\mathcal{D}$. When…
A recent model for property testing of probability distributions (Chakraborty et al., ITCS 2013, Canonne et al., SICOMP 2015) enables tremendous savings in the sample complexity of testing algorithms, by allowing them to condition the…
We are interested in testing properties of distributions with systematically mislabeled samples. Our goal is to make decisions about unknown probability distributions, using a sample that has been collected by a confused collector, such as…
We investigate the problem of identity testing for multidimensional histogram distributions. A distribution $p: D \rightarrow \mathbb{R}_+$, where $D \subseteq \mathbb{R}^d$, is called a $k$-histogram if there exists a partition of the…
In this work we present novel differentially private identity (goodness-of-fit) testers for natural and widely studied classes of multivariate product distributions: Gaussians in $\mathbb{R}^d$ with known covariance and product…
We study the question of identity testing for structured distributions. More precisely, given samples from a {\em structured} distribution $q$ over $[n]$ and an explicit distribution $p$ over $[n]$, we wish to distinguish whether $q=p$…
Determining whether an unknown distribution matches a known reference is a cornerstone problem in distributional analysis. While classical results establish a rigorous framework in the case of distributions over finite domains, real-world…
We study the problem of testing identity of a collection of unknown quantum states given sample access to this collection, each state appearing with some known probability. We show that for a collection of $d$-dimensional quantum states of…
In this work, we revisit the problem of uniformity testing of discrete probability distributions. A fundamental problem in distribution testing, testing uniformity over a known domain has been addressed over a significant line of works, and…
Distribution testing is a fundamental statistical task with many applications, but we are interested in a variety of problems where systematic mislabelings of the sample prevent us from applying the existing theory. To apply distribution…
Motivated by the question of data quantization and "binning," we revisit the problem of identity testing of discrete probability distributions. Identity testing (a.k.a. one-sample testing), a fundamental and by now well-understood problem…
Uniformity testing and the more general identity testing are well studied problems in distributional property testing. Most previous work focuses on testing under $L_1$-distance. However, when the support is very large or even continuous,…
We propose a new setting for testing properties of distributions while receiving samples from several distributions, but few samples per distribution. Given samples from $s$ distributions, $p_1, p_2, \ldots, p_s$, we design testers for the…
We study the following distribution clustering problem: Given a hidden partition of $k$ distributions into two groups, such that the distributions within each group are the same, and the two distributions associated with the two clusters…
We study the identity testing problem for high-dimensional distributions. Given as input an explicit distribution $\mu$, an $\varepsilon>0$, and access to sampling oracle(s) for a hidden distribution $\pi$, the goal in identity testing is…
Equivalence testing, a fundamental problem in the field of distribution testing, seeks to infer if two unknown distributions on $[n]$ are the same or far apart in the total variation distance. Conditional sampling has emerged as a powerful…
We study the problems of identity and closeness testing of $n$-dimensional product distributions. Prior works by Canonne, Diakonikolas, Kane and Stewart (COLT 2017) and Daskalakis and Pan (COLT 2017) have established tight sample complexity…
In this work, we give a novel general approach for distribution testing. We describe two techniques: our first technique gives sample-optimal testers, while our second technique gives matching sample lower bounds. As a consequence, we…