Related papers: Random Indicator Imputation for Missing Not At Ran…

Multiple imputation with missing data indicators

Multiple imputation is a well-established general technique for analyzing data with missing values. A convenient way to implement multiple imputation is sequential regression multiple imputation (SRMI), also called chained equations…

Methodology · Statistics 2021-03-04 Lauren J Beesley , Irina Bondarenko , Michael R Elliott , Allison W Kurian , Steven J Katz , Jeremy M G Taylor

Imputation and Missing Indicators for handling missing data in the development and implementation of clinical prediction models: a simulation study

Background: Existing guidelines for handling missing data are generally not consistent with the goals of prediction modelling, where missing data can occur at any stage of the model pipeline. Multiple imputation (MI), often heralded as the…

Methodology · Statistics 2022-06-27 Rose Sisk , Matthew Sperrin , Niels Peek , Maarten van Smeden , Glen P. Martin

Identifiable Deep Latent Variable Models for MNAR Data

Missing data is a ubiquitous challenge in data analysis, often leading to biased and inaccurate results. Traditional imputation methods usually assume that the missingness mechanism is missing-at-random (MAR), where the missingness is…

Methodology · Statistics 2026-03-30 Huiming Xie , Fei Xue , Xiao Wang

Statistical Inference with Different Missing-data Mechanisms

When data are missing due to at most one cause from some time to next time, we can make sampling distribution inferences about the parameter of the data by modeling the missing-data mechanism correctly. Proverbially, in case its mechanism…

Methodology · Statistics 2014-07-21 Kosuke Morikawa , Yutaka Kano

Estimation and imputation in Probabilistic Principal Component Analysis with Missing Not At Random data

Missing Not At Random (MNAR) values lead to significant biases in the data, since the probability of missingness depends on the unobserved values.They are ''not ignorable'' in the sense that they often require defining a model for the…

Statistics Theory · Mathematics 2020-06-11 Aude Sportisse , Claire Boyer , Julie Josse

Deep Generative Imputation Model for Missing Not At Random Data

Data analysis usually suffers from the Missing Not At Random (MNAR) problem, where the cause of the value missing is not fully observed. Compared to the naive Missing Completely At Random (MCAR) problem, it is more in line with the…

Machine Learning · Computer Science 2025-05-27 Jialei Chen , Yuanbo Xu , Pengyang Wang , Yongjian Yang

Recursive Equations For Imputation Of Missing Not At Random Data With Sparse Pattern Support

A common approach for handling missing values in data analysis pipelines is multiple imputation via software packages such as MICE (Van Buuren and Groothuis-Oudshoorn, 2011) and Amelia (Honaker et al., 2011). These packages typically assume…

Methodology · Statistics 2025-07-23 Trung Phung , Kyle Reese , Ilya Shpitser , Rohit Bhattacharya

Imputation Scores

Given the prevalence of missing data in modern statistical research, a broad range of methods is available for any given imputation task. How does one choose the `best' imputation method in a given application? The standard approach is to…

Applications · Statistics 2022-12-01 Jeffrey Näf , Meta-Lina Spohn , Loris Michel , Nicolai Meinshausen

Semiparametric Estimation with Data Missing Not at Random Using an Instrumental Variable

Missing data occur frequently in empirical studies in health and social sciences, often compromising our ability to make accurate inferences. An outcome is said to be missing not at random (MNAR) if, conditional on the observed variables,…

Methodology · Statistics 2019-01-23 BaoLuo Sun , Lan Liu , Wang Miao , Kathleen Wirth , James Robins , Eric Tchetgen Tchetgen

An Investigation of Methods for Handling Missing Data with Penalized Regression

We investigate methods for penalized regression in the presence of missing observations. This paper introduces a method for estimating the parameters which compensates for the missing observations. We first, derive an unbiased estimator of…

Applications · Statistics 2013-10-09 Yunjin Choi , Robert Tibshirani

What Is a Good Imputation Under MAR Missingness?

Missing values pose a persistent challenge in modern data science. Consequently, there is an ever-growing number of publications introducing new imputation methods in various fields. The present paper attempts to take a step back and…

Statistics Theory · Mathematics 2026-01-21 Jeffrey Näf , Erwan Scornet , Julie Josse

An integrated approach to test for missing not at random

Missing data can lead to inefficiencies and biases in analyses, in particular when data are missing not at random (MNAR). It is thus vital to understand and correctly identify the missing data mechanism. Recovering missing values through a…

Methodology · Statistics 2022-12-08 Jack Noonan , Adetola Adedamola Adediran , Robin Mitra , Stefanie Biedermann

Imputation and low-rank estimation with Missing Not At Random data

Missing values challenge data analysis because many supervised and unsupervised learning methods cannot be applied directly to incomplete data. Matrix completion based on low-rank assumptions are very powerful solution for dealing with…

Machine Learning · Statistics 2020-01-30 Aude Sportisse , Claire Boyer , Julie Josse

Identifiable Generative Models for Missing Not at Random Data Imputation

Real-world datasets often have missing values associated with complex generative processes, where the cause of the missingness may not be fully observed. This is known as missing not at random (MNAR) data. However, many imputation methods…

Machine Learning · Computer Science 2021-10-29 Chao Ma , Cheng Zhang

A fresh look at ignorability for likelihood inference

When data are incomplete, a random vector Y for the data process together with a binary random vector R for the process that causes missing data, are modelled jointly. We review conditions under which R can be ignored for drawing likelihood…

Methodology · Statistics 2019-04-01 John C Galati

Missing Not at Random in Matrix Completion: The Effectiveness of Estimating Missingness Probabilities Under a Low Nuclear Norm Assumption

Matrix completion is often applied to data with entries missing not at random (MNAR). For example, consider a recommendation system where users tend to only reveal ratings for items they like. In this case, a matrix completion method that…

Machine Learning · Statistics 2019-10-30 Wei Ma , George H. Chen

Multiple Imputation with Neural Network Gaussian Process for High-dimensional Incomplete Data

Missing data are ubiquitous in real world applications and, if not adequately handled, may lead to the loss of information and biased findings in downstream analysis. Particularly, high-dimensional incomplete data with a moderate sample…

Machine Learning · Computer Science 2022-12-23 Zongyu Dai , Zhiqi Bu , Qi Long

Sufficient Identification Conditions and Semiparametric Estimation under Missing Not at Random Mechanisms

Conducting valid statistical analyses is challenging in the presence of missing-not-at-random (MNAR) data, where the missingness mechanism is dependent on the missing values themselves even conditioned on the observed data. Here, we…

Methodology · Statistics 2023-06-13 Anna Guo , Jiwei Zhao , Razieh Nabi

Developing robust methods to handle missing data in real-world applications effectively

Missing data is a pervasive challenge spanning diverse data types, including tabular, sensor data, time-series, images and so on. Its origins are multifaceted, resulting in various missing mechanisms. Prior research in this field has…

Machine Learning · Computer Science 2025-03-03 Youran Zhou , Mohamed Reda Bouadjenek , Sunil Aryal

Multiple Imputation for Non-Monotone Missing Not at Random Binary Data using the No Self-Censoring Model

Although approaches for handling missing data from longitudinal studies are well-developed when the patterns of missingness are monotone, fewer methods are available for non-monotone missingness. Moreover, the conventional missing at random…

Methodology · Statistics 2023-02-28 Boyu Ren , Stuart R. Lipsitz , Roger D. Weiss , Garrett M. Fitzmaurice