Related papers: Partial Identification under Missing Data Using We…

Identification, Doubly Robust Estimation, and Semiparametric Efficiency Theory of Nonignorable Missing Data With a Shadow Variable

We consider identification and estimation with an outcome missing not at random (MNAR). We study an identification strategy based on a so-called shadow variable. A shadow variable is assumed to be correlated with the outcome, but…

Methodology · Statistics 2019-09-10 Wang Miao, Lan Liu, Eric Tchetgen Tchetgen, Zhi Geng

Semiparametric Estimation with Data Missing Not at Random Using an Instrumental Variable

Missing data occur frequently in empirical studies in health and social sciences, often compromising our ability to make accurate inferences. An outcome is said to be missing not at random (MNAR) if, conditional on the observed variables,…

Methodology · Statistics 2019-01-23 BaoLuo Sun, Lan Liu, Wang Miao, Kathleen Wirth, James Robins, Eric Tchetgen Tchetgen

Identifiable Deep Latent Variable Models for MNAR Data

Missing data is a ubiquitous challenge in data analysis, often leading to biased and inaccurate results. Traditional imputation methods usually assume that the missingness mechanism is missing-at-random (MAR), where the missingness is…

Methodology · Statistics 2026-03-30 Huiming Xie, Fei Xue, Xiao Wang

Partial Identifiability in Discrete Data With Measurement Error

When data contains measurement errors, it is necessary to make assumptions relating the observed, erroneous data to the unobserved true phenomena of interest. These assumptions should be justifiable on substantive grounds, but are often…

Machine Learning · Statistics 2020-12-24 Noam Finkelstein, Roy Adams, Suchi Saria, Ilya Shpitser

Sufficient Identification Conditions and Semiparametric Estimation under Missing Not at Random Mechanisms

Conducting valid statistical analyses is challenging in the presence of missing-not-at-random (MNAR) data, where the missingness mechanism is dependent on the missing values themselves even conditioned on the observed data. Here, we…

Methodology · Statistics 2023-06-13 Anna Guo, Jiwei Zhao, Razieh Nabi

Partial identification via conditional linear programs: estimation and policy learning

Many important quantities of interest are only partially identified from observable data: the data can limit them to a set of plausible values, but not uniquely determine them. This paper develops a unified framework for covariate-assisted…

Methodology · Statistics 2025-08-15 Eli Ben-Michael

Identification and Estimation for Nonignorable Missing Data: A Data Fusion Approach

We consider the task of identifying and estimating a parameter of interest in settings where data is missing not at random (MNAR). In general, such parameters are not identified without strong assumptions on the missing data model. In this…

Methodology · Statistics 2024-02-29 Zixiao Wang, AmirEmad Ghassami, Ilya Shpitser

A Review of Statistical Methods for Handling Nonignorable Missing Data using Instrument Approach

Nonignorable missing data, where the probability of missingness depends on unobserved values, presents a significant challenge in statistical analysis. Traditional methods often rely on strong parametric assumptions that are difficult to…

Methodology · Statistics 2025-09-19 Yujie Zhao

Graphical Models for Processing Missing Data

This paper reviews recent advances in missing data research using graphical models to represent multivariate dependencies. We first examine the limitations of traditional frameworks from three different perspectives: transparency,…

Methodology · Statistics 2019-11-15 Karthika Mohan, Judea Pearl

Partial identification for discrete data with nonignorable missing outcomes

Nonignorable missing outcomes are common in real world datasets and often require strong parametric assumptions to achieve identification. These assumptions can be implausible or untestable, and so we may forgo them in favour of partially…

Methodology · Statistics 2023-10-19 Daniel Daly-Grafstein, Paul Gustafson

Solving Non-identifiable Latent Feature Models

Latent feature models (LFMs) are widely employed for extracting latent structures of data. While offering high, parameter estimation is difficult with LFMs because of the combinational nature of latent features, and non-identifiability is a…

Machine Learning · Computer Science 2018-09-27 Ryota Suzuki, Shingo Takahashi, Murtuza Petladwala, Shigeru Kohmoto

High-dimensional estimation with missing data: Statistical and computational limits

We consider computationally-efficient estimation of population parameters when observations are subject to missing data. In particular, we consider estimation under the realizable contamination model of missing data in which an $\epsilon$…

Statistics Theory · Mathematics 2026-03-18 Kabir Aladin Verchand, Ankit Pensia, Saminul Haque, Rohith Kuditipudi

Efficient Nonparametric Inference for Mediation Analysis with Nonignorable Missing Confounders

Mediation analysis is widely used for exploring treatment mechanisms; however, it faces challenges when nonignorable missing confounders are present. Efficient inference of mediation effects and the efficiency loss due to nonignorable…

Methodology · Statistics 2026-04-22 Jiawei Shan, Wei Li, Chunrong Ai

Semiparametric Inference for Partially Identifiable Data Fusion Estimands via Double Machine Learning

Many statistical estimands of interest (e.g., in regression or causality) are functions of the joint distribution of multiple random variables. But in some applications, data is not available that measures all random variables on each…

Methodology · Statistics 2025-02-11 Yicong Jiang, Lucas Janson

Semiparametric efficiency in GMM models with auxiliary data

We study semiparametric efficiency bounds and efficient estimation of parameters defined through general moment restrictions with missing data. Identification relies on auxiliary data containing information about the distribution of the…

Statistics Theory · Mathematics 2008-04-04 Xiaohong Chen, Han Hong, Alessandro Tarozzi

An integrated approach to test for missing not at random

Missing data can lead to inefficiencies and biases in analyses, in particular when data are missing not at random (MNAR). It is thus vital to understand and correctly identify the missing data mechanism. Recovering missing values through a…

Methodology · Statistics 2022-12-08 Jack Noonan, Adetola Adedamola Adediran, Robin Mitra, Stefanie Biedermann

Evaluation of Missing Data Analytical Techniques in Longitudinal Research: Traditional and Machine Learning Approaches

Missing Not at Random (MNAR) and nonnormal data are challenging to handle. Traditional missing data analytical techniques such as full information maximum likelihood estimation (FIML) may fail with nonnormal data as they are built on normal…

Applications · Statistics 2024-06-21 Dandan Tang, Xin Tong

Modeling Latent Variable Uncertainty for Loss-based Learning

We consider the problem of parameter estimation using weakly supervised datasets, where a training sample consists of the input and a partially specified annotation, which we refer to as the output. The missing information in the annotation…

Machine Learning · Computer Science 2012-06-22 M. Pawan Kumar, Ben Packer, Daphne Koller

Full-semiparametric-likelihood-based inference for non-ignorable missing data

During the past few decades, missing-data problems have been studied extensively, with a focus on the ignorable missing case, where the missing probability depends only on observable quantities. By contrast, research into non-ignorable…

Methodology · Statistics 2019-08-06 Yukun Liu, Pengfei Li, Jing Qin

A Unified Framework for Inference with General Missingness Patterns and Machine Learning Imputation

Pre-trained machine learning (ML) predictions have been increasingly used to complement incomplete data to enable downstream scientific inquiries, but their naive integration risks biased inferences. Recently, multiple methods have been…

Methodology · Statistics 2025-11-12 Xingran Chen, Tyler McCormick, Bhramar Mukherjee, Zhenke Wu