Related papers: Deep Generative Pattern-Set Mixture Models for Non…

Clustering Data with Nonignorable Missingness using Semi-Parametric Mixture Models

We are concerned with clustering continuous data sets subject to non-ignorable missingness. We perform clustering with a specific semi-parametric mixture, under the assumption of conditional independence given the component. The mixture model…

Methodology · Statistics 2021-07-20 Marie Du Roy de Chaumaray, Matthieu Marbac

Sequential identification of nonignorable missing data mechanisms

With nonignorable missing data, likelihood-based inference should be based on the joint distribution of the study variables and their missingness indicators. These joint models cannot be estimated from the data alone, thus requiring the…

Statistics Theory · Mathematics 2017-01-06 Mauricio Sadinle, Jerome P. Reiter

Identifiable Deep Latent Variable Models for MNAR Data

Missing data is a ubiquitous challenge in data analysis, often leading to biased and inaccurate results. Traditional imputation methods usually assume that the missingness mechanism is missing-at-random (MAR), where the missingness is…

Methodology · Statistics 2026-03-30 Huiming Xie, Fei Xue, Xiao Wang

Sequentially additive nonignorable missing data modeling using auxiliary marginal information

We study a class of missingness mechanisms, called sequentially additive nonignorable, for modeling multivariate data with item nonresponse. These mechanisms explicitly allow the probability of nonresponse for each variable to depend on the…

Methodology · Statistics 2019-02-19 Mauricio Sadinle, Jerome P. Reiter

Nonparametric Pattern-Mixture Models for Inference with Missing Data

Pattern-mixture models provide a transparent approach for handling missing data, where the full-data distribution is factorized in a way that explicitly shows the parts that can be estimated from observed data alone, and the parts that…

Methodology · Statistics 2019-04-26 Yen-Chi Chen, Mauricio Sadinle

A Latent Variable Framework for Multiple Imputation with Non-ignorable Missingness: Analyzing Perceptions of Social Justice in Europe

This paper proposes a general multiple imputation approach for analyzing large-scale data with missing values. An imputation model is derived from a joint distribution induced by a latent variable model, which can flexibly capture…

Methodology · Statistics 2025-09-26 Siliang Zhang, Yunxiao Chen, Jouni Kuha

Identifiability of Normal and Normal Mixture Models With Nonignorable Missing Data

Missing data problems arise in many applied research studies. They may jeopardize statistical inference of the model of interest, if the missing mechanism is nonignorable, that is, the missing mechanism depends on the missing values…

Statistics Theory · Mathematics 2015-09-15 Wang Miao, Peng Ding, Zhi Geng

Learning from missing data with the Latent Block Model

Missing data can be informative. Ignoring this information can lead to misleading conclusions when the data model does not allow information to be extracted from the missing data. We propose a co-clustering model, based on the Latent Block…

Machine Learning · Computer Science 2020-10-26 Gabriel Frisch, Jean-Benoist Léger, Yves Grandvalet

Identifiable Generative Models for Missing Not at Random Data Imputation

Real-world datasets often have missing values associated with complex generative processes, where the cause of the missingness may not be fully observed. This is known as missing not at random (MNAR) data. However, many imputation methods…

Machine Learning · Computer Science 2021-10-29 Chao Ma, Cheng Zhang

Deep Unsupervised Clustering Using Mixture of Autoencoders

Unsupervised clustering is one of the most fundamental challenges in machine learning. A popular hypothesis is that data are generated from a union of low-dimensional nonlinear manifolds; thus an approach to clustering is identifying and…

Machine Learning · Computer Science 2017-12-27 Dejiao Zhang, Yifan Sun, Brian Eriksson, Laura Balzano

Improving Missing Data Imputation with Deep Generative Models

Datasets with missing values are very common in industry applications, and they can have a negative impact on machine learning models. Recent studies introduced solutions to the problem of imputing missing values based on deep generative…

Machine Learning · Computer Science 2019-02-28 Ramiro D. Camino, Christian A. Hammerschmidt, Radu State

Proposition of a Theoretical Model for Missing Data Imputation using Deep Learning and Evolutionary Algorithms

In the last couple of decades, there have been major advancements in the domain of missing data imputation. The techniques in the domain include, amongst others: Expectation Maximization, Neural Networks with Evolutionary Algorithms or…

Neural and Evolutionary Computing · Computer Science 2015-12-07 Collins Leke, Tshilidzi Marwala, Satyakama Paul

Model-based Clustering with Missing Not At Random Data

Model-based unsupervised learning, as any learning task, stalls as soon as missing data occurs. This is even more true when the missing data are informative, that is, missing not at random (MNAR). In this paper, we propose model-based…

Machine Learning · Statistics 2023-12-25 Aude Sportisse, Matthieu Marbac, Fabien Laporte, Gilles Celeux, Claire Boyer, Julie Josse, Christophe Biernacki

Autoencoders and Probabilistic Inference with Missing Data: An Exact Solution for The Factor Analysis Case

Latent variable models can be used to probabilistically "fill-in" missing data entries. The variational autoencoder architecture (Kingma and Welling, 2014; Rezende et al., 2014) includes a "recognition" or "encoder" network that infers the…

Machine Learning · Computer Science 2019-02-20 Christopher K. I. Williams, Charlie Nash, Alfredo Nazábal

not-MIWAE: Deep Generative Modelling with Missing not at Random Data

When a missing process depends on the missing values themselves, it needs to be explicitly modelled and taken into account while doing likelihood-based inference. We present an approach for building and fitting deep latent variable models…

Machine Learning · Statistics 2021-03-19 Niels Bruun Ipsen, Pierre-Alexandre Mattei, Jes Frellsen

MIDA: Multiple Imputation using Denoising Autoencoders

Missing data is a significant problem impacting all domains. The state-of-the-art framework for minimizing missing data bias is multiple imputation, for which the choice of an imputation model remains nontrivial. We propose a multiple…

Machine Learning · Computer Science 2018-02-20 Lovedeep Gondara, Ke Wang

Imputation of Missing Data Using Linear Gaussian Cluster-Weighted Modeling

Missing data theory deals with statistical methods for handling missing data. Missing data occurs when some values are not stored or observed for variables of interest. However, most of the statistical theory assumes that data…

Methodology · Statistics 2021-10-26 Luis Alejandro Masmela-Caita, Thais Paiva Galletti, Marcos Oliveira Prates

Gaussian Copula Models for Nonignorable Missing Data Using Auxiliary Marginal Quantiles

We present an approach for modeling and imputation of nonignorable missing data. Our approach uses Bayesian data integration to combine (1) a Gaussian copula model for all study variables and missingness indicators, which allows arbitrary…

Methodology · Statistics 2024-11-19 Joseph Feldman, Jerome P. Reiter, Daniel R. Kowal

Discrete Choice Models for Nonmonotone Nonignorable Missing Data: Identification and Inference

Nonmonotone missing data arise routinely in empirical studies of social and health sciences, and when ignored, can induce selection bias and loss of efficiency. In practice, it is common to account for nonresponse under a missing-at-random…

Methodology · Statistics 2017-07-20 Eric J. Tchetgen Tchetgen, Linbo Wang, BaoLuo Sun

MissDeepCausal: Causal Inference from Incomplete Data Using Deep Latent Variable Models

Inferring causal effects of a treatment, intervention or policy from observational data is central to many applications. However, state-of-the-art methods for causal inference seldom consider the possibility that covariates have missing…

Methodology · Statistics 2020-02-26 Imke Mayer, Julie Josse, Félix Raimundo, Jean-Philippe Vert