Related papers: Estimation beyond Missing (Completely) at Random

200 papers

We consider computationally-efficient estimation of population parameters when observations are subject to missing data. In particular, we consider estimation under the realizable contamination model of missing data in which an $\epsilon$…

Statistics Theory · Mathematics 2026-03-18 Kabir Aladin Verchand, Ankit Pensia, Saminul Haque, Rohith Kuditipudi

We study mean estimation for a Gaussian distribution with identity covariance in $\mathbb{R}^d$ under a missing data scheme termed realizable $\epsilon$-contamination model. In this model an adversary can choose a function $r(x)$ between 0…

Machine Learning · Computer Science 2026-03-18 Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas

Standard methods for estimating average causal effects require complete observations of the exposure and confounders. In observational studies, however, missing data are ubiquitous. Motivated by a study on the effect of prescription opioids…

Methodology · Statistics 2025-06-30 Lan Wen, Glen McGee

Given a set of incomplete observations, we study the nonparametric problem of testing whether data are Missing Completely At Random (MCAR). Our first contribution is to characterise precisely the set of alternatives that can be…

Statistics Theory · Mathematics 2022-05-19 Thomas B Berrett, Richard J Samworth

In the missing data literature, the Maximum Likelihood Estimator (MLE) is celebrated for its ignorability property under missing at random (MAR) data. However, its sensitivity to misspecification of the (complete) data model, even under…

Methodology · Statistics 2025-09-23 Badr-Eddine Chérief-Abdellatif, Jeffrey Näf

Missing Not At Random (MNAR) values lead to significant biases in the data, since the probability of missingness depends on the unobserved values. They are "not ignorable" in the sense that they often require defining a model for the…

Statistics Theory · Mathematics 2020-06-11 Aude Sportisse, Claire Boyer, Julie Josse

Conducting valid statistical analyses is challenging in the presence of missing-not-at-random (MNAR) data, where the missingness mechanism is dependent on the missing values themselves even conditioned on the observed data. Here, we…

Methodology · Statistics 2023-06-13 Anna Guo, Jiwei Zhao, Razieh Nabi

The analysis of randomized trials is often complicated by the occurrence of intercurrent events and missing values. Even though there are different strategies to address missing values it is still common to require missing values…

Methodology · Statistics 2025-11-11 A. Ruiz de Villa, Ll. Badiella

Conditions ensuring optimal parameter estimation in the presence of missing data are well established in inference, typically relying on the Missing-at-Random (MAR) assumption. In prediction, similar principles are often assumed to apply.…

Methodology · Statistics 2026-03-19 Pierre Catoire, Robin Genuer, Cecile Proust-Lima

Missing values arise in most real-world data sets due to the aggregation of multiple sources and intrinsically missing information (sensor failure, unanswered questions in surveys...). In fact, the very nature of missing values usually…

Machine Learning · Statistics 2022-02-04 Alexis Ayme, Claire Boyer, Aymeric Dieuleveut, Erwan Scornet

Constant (naive) imputation is still widely used in practice as this is a first easy-to-use technique to deal with missing data. Yet, this simple method could be expected to induce a large bias for prediction purposes, as the imputed input…

Statistics Theory · Mathematics 2024-02-07 Alexis Ayme, Claire Boyer, Aymeric Dieuleveut, Erwan Scornet

Missing values are ubiquitous in (data) science, with potential detrimental consequences for any statistical analysis. As a consequence, a wealth of methods and theoretical results have been developed in recent years. Still, many questions…

Statistics Theory · Mathematics 2026-03-25 Badr-Eddine Chérief-Abdellatif, Jeffrey Näf

Missing data occur frequently in empirical studies in health and social sciences, often compromising our ability to make accurate inferences. An outcome is said to be missing not at random (MNAR) if, conditional on the observed variables,…

Methodology · Statistics 2019-01-23 BaoLuo Sun, Lan Liu, Wang Miao, Kathleen Wirth, James Robins, Eric Tchetgen Tchetgen

Missing data is a pervasive challenge spanning diverse data types, including tabular, sensor data, time-series, images and so on. Its origins are multifaceted, resulting in various missing mechanisms. Prior research in this field has…

Machine Learning · Computer Science 2025-03-03 Youran Zhou, Mohamed Reda Bouadjenek, Sunil Aryal

Practical problems with missing data are common, and statistical methods have been developed concerning the validity and/or efficiency of statistical procedures. On a central focus, there have been longstanding interests on the mechanism…

Methodology · Statistics 2020-03-26 Rui Duan, C. Jason Liang, Pamela Shaw, Cheng Yong Tang, Yong Chen

We consider an empirical likelihood inference for parameters defined by general estimating equations when some components of the random observations are subject to missingness. As the nature of the estimating equations is wide-ranging, we…

Statistics Theory · Mathematics 2009-03-05 Dong Wang, Song Xi Chen

Missing data is a ubiquitous challenge in data analysis, often leading to biased and inaccurate results. Traditional imputation methods usually assume that the missingness mechanism is missing-at-random (MAR), where the missingness is…

Methodology · Statistics 2026-03-30 Huiming Xie, Fei Xue, Xiao Wang

Matrix completion is often applied to data with entries missing not at random (MNAR). For example, consider a recommendation system where users tend to only reveal ratings for items they like. In this case, a matrix completion method that…

Machine Learning · Statistics 2019-10-30 Wei Ma, George H. Chen

Noncompliance and missing data often occur in randomized trials, which complicate the inference of causal effects. When both noncompliance and missing data are present, previous papers proposed moment and maximum likelihood estimators for…

Methodology · Statistics 2014-09-04 Hua Chen, Peng Ding, Zhi Geng, Xiao-Hua Zhou

Pre-trained machine learning (ML) predictions have been increasingly used to complement incomplete data to enable downstream scientific inquiries, but their naive integration risks biased inferences. Recently, multiple methods have been…

Methodology · Statistics 2025-11-12 Xingran Chen, Tyler McCormick, Bhramar Mukherjee, Zhenke Wu