Related papers: Mean Estimation from Coarse Data: Characterization…

200 papers

For many learning problems one may not have access to fine grained label information; e.g., an image can be labeled as husky, dog, or even animal depending on the expertise of the annotator. In this work, we formalize these settings and…

Machine Learning · Computer Science 2023-03-27 Dimitris Fotakis, Alkis Kalavasis, Vasilis Kontonis, Christos Tzamos

Estimation of the covariance matrix has attracted a lot of attention of the statistical research community over the years, partially due to important applications such as Principal Component Analysis. However, frequently used empirical…

Statistics Theory · Mathematics 2018-06-19 Stanislav Minsker

We study the algorithmic problem of robust mean estimation of an identity covariance Gaussian in the presence of mean-shift contamination. In this contamination model, we are given a set of points in $\mathbb{R}^d$ generated i.i.d. via the…

Data Structures and Algorithms · Computer Science 2025-02-21 Ilias Diakonikolas, Giannis Iakovidis, Daniel M. Kane, Thanasis Pittas

We study mean estimation for a Gaussian distribution with identity covariance in $\mathbb{R}^d$ under a missing data scheme termed realizable $\epsilon$-contamination model. In this model an adversary can choose a function $r(x)$ between 0…

Machine Learning · Computer Science 2026-03-18 Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas

We consider the problem of estimating the mean and covariance of a distribution from iid samples in $\mathbb{R}^n$, in the presence of an $\eta$ fraction of malicious noise; this is in contrast to much recent work where the noise itself is…

Data Structures and Algorithms · Computer Science 2016-08-16 Kevin A. Lai, Anup B. Rao, Santosh Vempala

Multivariate Gaussian is often used as a first approximation to the distribution of high-dimensional data. Determining the parameters of this distribution under various constraints is a widely studied problem in statistics, and is often…

Statistics Theory · Mathematics 2016-02-09 Samuel Balmand, Arnak Dalalyan

Common workflows in machine learning and statistics rely on the ability to partition the information in a data set into independent portions. Recent work has shown that this may be possible even when conventional sample splitting is not…

Methodology · Statistics 2025-12-16 Ameer Dharamshi, Anna Neufeld, Lucy L. Gao, Jacob Bien, Daniela Witten

We introduce a new method for estimating the mean of an outcome variable within groups when researchers only observe the average of the outcome and group indicators across a set of aggregation units, such as geographical areas. Existing…

Methodology · Statistics 2026-05-01 Cory McCartan, Shiro Kuriwaki

We consider computationally-efficient estimation of population parameters when observations are subject to missing data. In particular, we consider estimation under the realizable contamination model of missing data in which an $\epsilon$…

Statistics Theory · Mathematics 2026-03-18 Kabir Aladin Verchand, Ankit Pensia, Saminul Haque, Rohith Kuditipudi

We consider the problem of mean estimation assuming only finite variance. We study a new class of mean estimators constructed by integrating over random noise applied to a soft-truncated empirical mean estimator. For appropriate choices of…

Statistics Theory · Mathematics 2019-06-26 Matthew J. Holland

This paper focuses on the estimation of the sample covariance matrix from low-dimensional random projections of data known as compressive measurements. In particular, we present an unbiased estimator to extract the covariance structure from…

Machine Learning · Statistics 2017-05-01 Farhad Pourkamali-Anaraki

The goal of this paper is to show that a single robust estimator of the mean of a multivariate Gaussian distribution can enjoy five desirable properties. First, it is computationally tractable in the sense that it can be computed in a time…

Statistics Theory · Mathematics 2022-10-28 Arnak S. Dalalyan, Arshak Minasyan

Consider the problem of estimating the mean of a Gaussian random vector when the mean vector is assumed to be in a given convex set. The most natural solution is to take the Euclidean projection of the data vector on to this convex set; in…

Statistics Theory · Mathematics 2014-11-21 Sourav Chatterjee

Robust mean estimation is the problem of estimating the mean $\mu \in \mathbb{R}^d$ of a $d$-dimensional distribution $D$ from a list of independent samples, an $\epsilon$-fraction of which have been arbitrarily corrupted by a malicious…

Computational Complexity · Computer Science 2019-06-05 Samuel B. Hopkins, Jerry Li

We consider the classical problem of estimating the covariance matrix of a subgaussian distribution from i.i.d. samples in the novel context of coarse quantization, i.e., instead of having full knowledge of the samples, they are quantized…

Information Theory · Computer Science 2022-04-25 Sjoerd Dirksen, Johannes Maly, Holger Rauhut

We study polynomial time algorithms for estimating the mean of a heavy-tailed multivariate random vector. We assume only that the random vector $X$ has finite mean and covariance. In this setting, the radius of confidence intervals achieved…

Statistics Theory · Mathematics 2019-06-05 Samuel B. Hopkins

In many areas of science one aims to estimate latent sub-population mean curves based only on observations of aggregated population curves. By aggregated curves we mean linear combination of functional data that cannot be observed…

Methodology · Statistics 2011-02-15 Ronaldo Dias, Nancy L. Garcia, Alexandra M. Schmidt

Gaussian process regression is a powerful Bayesian nonlinear regression method. Recent research has enabled the capture of many types of observations using non-Gaussian likelihoods. To deal with various tasks in spatial modeling, we benefit…

Machine Learning · Statistics 2025-08-26 Yuta Shikuri

Partition-wise models offer a flexible approach for modeling complex and multidimensional data that are capable of producing interpretable results. They are based on partitioning the observed data into regions, each of which is modeled with…

Methodology · Statistics 2017-06-07 Rex C. Y. Cheung, Alexander Aue, Thomas C. M. Lee

We study the problem of estimating the parameters of a Gaussian distribution when samples are only shown if they fall in some (unknown) subset $S \subseteq \mathbb{R}^d$. This core problem in truncated statistics has a long history going back to…

Statistics Theory · Mathematics 2019-08-06 Vasilis Kontonis, Christos Tzamos, Manolis Zampetakis