Related papers: Sample-based high-dimensional convexity testing
We consider the problem of testing whether an unknown and arbitrary set $S \subseteq \mathbb{R}^n$ (given as a black-box membership oracle) is convex, versus $\varepsilon$-far from every convex set, under the standard Gaussian distribution.…
We consider the problem of robustly testing the norm of a high-dimensional sparse signal vector under two different observation models. In the first model, we are given $n$ i.i.d. samples from the distribution…
We study the basic statistical problem of testing whether normally distributed $n$-dimensional data has been truncated, i.e. altered by only retaining points that lie in some unknown truncation set $S \subseteq \mathbb{R}^n$. As our main…
We revisit the problem of property testing for convex position for point sets in $\mathbb{R}^d$. Our results draw from previous ideas of Czumaj, Sohler, and Ziegler (ESA 2000). First, the algorithm is redesigned and its analysis is revised…
We study the problems of testing and learning high-dimensional discrete convex sets. The simplest high-dimensional discrete domain where convexity is a non-trivial property is the ternary hypercube, $\{-1,0,1\}^n$. The goal of this work is…
In this work, we analyze an efficient sampling-based algorithm for general-purpose reachability analysis, which remains a notoriously challenging problem with applications ranging from neural network verification to safety analysis of…
In this work, we study the sample complexity of two variants of product testing when restricted to single-copy measurements. In particular, we consider both bipartite product testing (i.e., does there exist at least one non-trivial cut…
Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are…
In this work, we develop new insights into the fundamental problem of convexity testing of real-valued functions over the domain $[n]$. Specifically, we present a nonadaptive algorithm that, given inputs $\eps \in (0,1), s \in \mathbb{N}$,…
We study the problem of testing the covariance matrix of a high-dimensional Gaussian in a robust setting, where the input distribution has been corrupted in Huber's contamination model. Specifically, we are given i.i.d. samples from a…
We study the problem of testing discrete distributions with a focus on the high probability regime. Specifically, given samples from one or more discrete distributions, a property $\mathcal{P}$, and parameters $0< \epsilon, \delta <1$, we…
We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution $p$, extensive research has established optimal bounds for uniformity testing,…
We study monotonicity testing of functions $f \colon \{0,1\}^d \to \{0,1\}$ using sample-based algorithms, which are only allowed to observe the value of $f$ on points drawn independently from the uniform distribution. A classic result by…
In this paper we consider the uniformity testing problem for high-dimensional discrete distributions (multinomials) under sparse alternatives. More precisely, we derive sharp detection thresholds for testing, based on $n$ samples, whether a…
The two-sample problem, which consists in testing whether independent samples on $\mathbb{R}^d$ are drawn from the same (unknown) distribution, finds applications in many areas. Its study in high-dimension is the subject of much attention,…
In this paper we study the problem of testing of constrained samplers over high-dimensional distributions with $(\varepsilon,\eta,\delta)$ guarantees. Samplers are increasingly used in a wide range of safety-critical ML applications, and…
We propose novel methodology for testing equality of model parameters between two high-dimensional populations. The technique is very general and applicable to a wide range of models. The method is based on sample splitting: the data is…
Sample weighting is widely used in deep learning. A large number of weighting methods essentially utilize the learning difficulty of training samples to calculate their weights. In this study, this scheme is called difficulty-based…
Samplers are the backbone of the implementations of any randomised algorithm. Unfortunately, obtaining an efficient algorithm to test the correctness of samplers is very hard to find. Recently, in a series of works, testers like…
Equivalence testing, a fundamental problem in the field of distribution testing, seeks to infer if two unknown distributions on $[n]$ are the same or far apart in the total variation distance. Conditional sampling has emerged as a powerful…