Related papers: Efficient Truncated Statistics with Unknown Trunca…
We study the estimation of distributional parameters when samples are shown only if they fall in some unknown set $S \subseteq \mathbb{R}^d$. Kontonis, Tzamos, and Zampetakis (FOCS'19) gave a $d^{\mathrm{poly}(1/\varepsilon)}$ time…
We provide an efficient algorithm for the classical problem, going back to Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy the parameters of a multivariate normal distribution from truncated samples. Truncated samples…
In truncated linear regression, samples $(x,y)$ are shown only when the outcome $y$ falls inside a certain survival set $S^\star$ and the goal is to estimate the unknown $d$-dimensional regressor $w^\star$. This problem has a long history…
In this paper, we study high-dimensional estimation from truncated samples. We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models. (i) For…
This paper develops an analytical method of truncating inequality constrained Gaussian distributed variables where the constraints are themselves described by Gaussian distributions. Existing truncation methods either assume hard…
We study the problem of estimating the mean of an identity covariance Gaussian in the truncated setting, in the regime when the truncation set comes from a low-complexity family $\mathcal{C}$ of sets. Specifically, for a fixed but unknown…
We study the problem of estimating the parameters of a Boolean product distribution in $d$ dimensions, when the samples are truncated by a set $S \subset \{0, 1\}^d$ accessible through a membership oracle. This is the first time that the…
Estimators of parameters of truncated distributions, namely the truncated normal distribution, have been widely studied for a known truncation region. There is also literature for estimating the unknown bounds for known parent…
We study the basic statistical problem of testing whether normally distributed $n$-dimensional data has been truncated, i.e. altered by only retaining points that lie in some unknown truncation set $S \subseteq \mathbb{R}^n$. As our main…
Truncated linear regression is a classical challenge in Statistics, wherein a label, $y = w^T x + \varepsilon$, and its corresponding feature vector, $x \in \mathbb{R}^k$, are only observed if the label falls in some subset $S \subseteq…
Gaussian processes are a powerful framework for quantifying uncertainty and for sequential decision-making but are limited by the requirement of solving linear systems. In general, this has a cubic cost in dataset size and is sensitive to…
In the setting of entangled single-sample distributions, the goal is to estimate some common parameter shared by a family of $n$ distributions, given one single sample from each distribution. This paper studies mean estimation for entangled…
Motivated by a recent result of Daskalakis et al. 2018, we analyze the population version of Expectation-Maximization (EM) algorithm for the case of \textit{truncated} mixtures of two Gaussians. Truncated samples from a $d$-dimensional…
Common workflows in machine learning and statistics rely on the ability to partition the information in a data set into independent portions. Recent work has shown that this may be possible even when conventional sample splitting is not…
We present a Hamiltonian Monte Carlo algorithm to sample from multivariate Gaussian distributions in which the target space is constrained by linear and quadratic inequalities or products thereof. The Hamiltonian equations of motion can be…
We consider the problem of simulating a Gaussian vector X, conditional on the fact that each component of X belongs to a finite interval [a_i,b_i], or a semi-finite interval [a_i,+infty). In the one-dimensional case, we design a table-based…
Gaussian graphical models (GGMs) are widely used for statistical modeling, because of ease of inference and the ubiquitous use of the normal distribution in practical approximations. However, they are also known for their limited modeling…
Optimization of complex functions, such as the output of computer simulators, is a difficult task that has received much attention in the literature. A less studied problem is that of optimization under unknown constraints, i.e., when the…
Maximizing high-dimensional, non-convex functions through noisy observations is a notoriously hard problem, but one that arises in many applications. In this paper, we tackle this challenge by modeling the unknown function as a sample from…
We study a nonparametric Bayesian approach to linear inverse problems under discrete observations. We use the discrete Fourier transform to convert our model into a truncated Gaussian sequence model, that is closely related to the classical…