Related papers: Testing Convex Truncation
We study the problem of estimating the parameters of a Gaussian distribution when samples are only shown if they fall in some (unknown) subset $S \subseteq \R^d$. This core problem in truncated statistics has long history going back to…
We characterize the fundamental limits of high-dimensional mean testing under arbitrary truncation, where samples are drawn from the conditional distribution $P(\cdot \mid S)$ for an unknown truncation set $S$ that may hide up to an…
In the problem of high-dimensional convexity testing, there is an unknown set $S \subseteq \mathbb{R}^n$ which is promised to be either convex or $\varepsilon$-far from every convex body with respect to the standard multivariate normal…
We provide an efficient algorithm for the classical problem, going back to Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy the parameters of a multivariate normal distribution from truncated samples. Truncated samples…
Testing for association or dependence between pairs of random variables is a fundamental problem in statistics. In some applications, data are subject to selection bias that causes dependence between observations even when it is absent from…
We develop a systematic, omnibus approach to goodness-of-fit testing for parametric distributional models when the variable of interest is only partially observed due to censoring and/or truncation. In many such designs, tests based on the…
We study the problem of estimating the parameters of a Boolean product distribution in $d$ dimensions, when the samples are truncated by a set $S \subset \{0, 1\}^d$ accessible through a membership oracle. This is the first time that the…
We consider the basic statistical problem of detecting truncation of the uniform distribution on the Boolean hypercube by juntas. More concretely, we give upper and lower bounds on the problem of distinguishing between i.i.d. sample access…
We investigate distribution testing with access to non-adaptive conditional samples. In the conditional sampling model, the algorithm is given the following access to a distribution: it submits a query set $S$ to an oracle, which returns a…
Distribution testing deals with what information can be deduced about an unknown distribution over $\{1,\ldots,n\}$, where the algorithm is only allowed to obtain a relatively small number of independent samples from the distribution. In…
We consider the task of Gaussian mean testing, that is, of testing whether a high-dimensional vector perturbed by white noise has large magnitude, or is the zero vector. This question, originating from the signal processing community, has…
In clinical and epidemiological research doubly truncated data often appear. This is the case, for instance, when the data registry is formed by interval sampling. Double truncation generally induces a sampling bias on the target variable,…
We study the convergence rate of randomly truncated stochastic algorithms, which consist in the truncation of the standard Robbins-Monro procedure on an increasing sequence of compact sets. Such a truncation is often required in practice to…
We study the convergence rate of randomly truncated stochastic algorithms, which consist in the truncation of the standard Robbins-Monro procedure on an increasing sequence of compact sets. Such a truncation is often required in practice to…
The finite sensitivity of instruments or detection methods means that data sets in many areas of astronomy, for example cosmological or exoplanet surveys, are necessarily systematically incomplete. Such data sets, where the population being…
We introduce a fast and easy-to-implement simulation algorithm for a multivariate normal distribution truncated on the intersection of a set of hyperplanes, and further generalize it to efficiently simulate random variables from a…
As in standard linear regression, in truncated linear regression, we are given access to observations $(A_i, y_i)_i$ whose dependent variable equals $y_i= A_i^{\rm T} \cdot x^* + \eta_i$, where $x^*$ is some fixed unknown vector of interest…
The framework of distribution testing is currently ubiquitous in the field of property testing. In this model, the input is a probability distribution accessible via independently drawn samples from an oracle. The testing task is to…
We consider the problem of testing whether an unknown and arbitrary set $S \subseteq \mathbb{R}^n$ (given as a black-box membership oracle) is convex, versus $\varepsilon$-far from every convex set, under the standard Gaussian distribution.…
In this work, we revisit the problem of uniformity testing of discrete probability distributions. A fundamental problem in distribution testing, testing uniformity over a known domain has been addressed over a significant line of works, and…