Related papers: Private High-Dimensional Hypothesis Testing
In this work we present novel differentially private identity (goodness-of-fit) testers for natural and widely studied classes of multivariate product distributions: Gaussians in $\mathbb{R}^d$ with known covariance and product…
We present a fast, differentially private algorithm for high-dimensional covariance-aware mean estimation with nearly optimal sample complexity. Only exponential-time estimators were previously known to achieve this guarantee. Given $n$…
In this work, we give efficient algorithms for privately estimating a Gaussian distribution in both pure and approximate differential privacy (DP) models with optimal dependence on the dimension in the sample complexity. In the pure DP…
We present differentially private algorithms for high-dimensional mean estimation. Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality, as they require $\Omega(d^{1/2})$ samples to achieve…
Given a dataset of $n$ i.i.d. samples from an unknown distribution $P$, we consider the problem of generating a sample from a distribution that is close to $P$ in total variation distance, under the constraint of differential privacy (DP).…
We present two sample-efficient differentially private mean estimators for $d$-dimensional (sub)Gaussian distributions with unknown covariance. Informally, given $n \gtrsim d/\alpha^2$ samples from such a distribution with mean $\mu$ and…
The increasing prevalence of high-dimensional data across various applications has raised significant privacy concerns in statistical inference. In this paper, we propose a differentially private integrated statistic for testing…
We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution $p$, extensive research has established optimal bounds for uniformity testing,…
We present novel, computationally efficient, and differentially private algorithms for two fundamental high-dimensional learning problems: learning a multivariate Gaussian and learning a product distribution over the Boolean hypercube in…
We provide the first $\widetilde{\mathcal{O}}\left(d\right)$-sample algorithm for sampling from unbounded Gaussian distributions under the constraint of $\left(\varepsilon, \delta\right)$-differential privacy. This is a quadratic…
We study the fundamental problems of identity testing (goodness of fit) and closeness testing (two-sample test) of distributions over $k$ elements, under differential privacy. While the problems have a long history in statistics, finite…
We investigate the problems of identity and closeness testing over a discrete population from random samples. Our goal is to develop efficient testers while guaranteeing Differential Privacy to the individuals of the population. We describe…
We provide a differentially private algorithm for hypothesis selection. Given samples from an unknown probability distribution $P$ and a set of $m$ probability distributions $\mathcal{H}$, the goal is to output, in a…
We study the problem of testing discrete distributions with a focus on the high probability regime. Specifically, given samples from one or more discrete distributions, a property $\mathcal{P}$, and parameters $0< \epsilon, \delta <1$, we…
We develop differentially private hypothesis testing methods for the small sample regime. Given a sample $\mathcal{D}$ from a categorical distribution $p$ over some domain $\Sigma$, an explicitly described distribution $q$ over $\Sigma$, some…
We present new algorithms for estimating and testing collision probability, a fundamental measure of the spread of a discrete distribution that is widely used in many scientific fields. We describe an algorithm that satisfies…
We provide optimal lower bounds for two well-known parameter estimation (also known as statistical estimation) tasks in high dimensions with approximate differential privacy. First, we prove that for any $\alpha \le O(1)$, estimating the…
We extend the framework of augmented distribution testing (Aliakbarpour, Indyk, Rubinfeld, and Silwal, NeurIPS 2024) to the differentially private setting. This captures scenarios where a data analyst must perform hypothesis testing tasks…
We study the following fundamental hypothesis testing problem, which we term Gaussian mean testing. Given i.i.d. samples from a distribution $p$ on $\mathbb{R}^d$, the task is to distinguish, with high probability, between the following…
We consider the task of privately obtaining prediction error guarantees in ordinary least-squares regression problems with Gaussian covariates (with unknown covariance structure). We provide the first sample-optimal polynomial time…