Related papers: Generative Conditional Missing Imputation Networks

Extended Missing Data Imputation via GANs for Ranking Applications

We propose Conditional Imputation GAN, an extended missing data imputation method based on Generative Adversarial Networks (GANs). The motivating use case is learning-to-rank, the cornerstone of modern search, recommendation system, and…

Machine Learning · Statistics 2021-11-11 Grace Deng, Cuize Han, David S. Matteson

Multiple Imputation via Generative Adversarial Network for High-dimensional Blockwise Missing Value Problems

Missing data are present in most real-world problems and need careful handling to preserve the prediction accuracy and statistical consistency in the downstream analysis. As the gold standard of handling missing data, multiple imputation…

Machine Learning · Computer Science 2021-12-23 Zongyu Dai, Zhiqi Bu, Qi Long

Deep Generative Imputation Model for Missing Not At Random Data

Data analysis usually suffers from the Missing Not At Random (MNAR) problem, where the cause of the value missing is not fully observed. Compared to the naive Missing Completely At Random (MCAR) problem, it is more in line with the…

Machine Learning · Computer Science 2025-05-27 Jialei Chen, Yuanbo Xu, Pengyang Wang, Yongjian Yang

Multiple imputation with missing data indicators

Multiple imputation is a well-established general technique for analyzing data with missing values. A convenient way to implement multiple imputation is sequential regression multiple imputation (SRMI), also called chained equations…

Methodology · Statistics 2021-03-04 Lauren J Beesley, Irina Bondarenko, Michael R Elliott, Allison W Kurian, Steven J Katz, Jeremy M G Taylor

Imputation of Missing Data with Class Imbalance using Conditional Generative Adversarial Networks

Missing data is a common problem faced with real-world datasets. Imputation is a widely used technique to estimate the missing data. State-of-the-art imputation approaches, such as Generative Adversarial Imputation Nets (GAIN), model the…

Machine Learning · Computer Science 2020-12-02 Saqib Ejaz Awan, Mohammed Bennamoun, Ferdous Sohel, Frank M Sanfilippo, Girish Dwivedi

Generative Imputation and Stochastic Prediction

In many machine learning applications, we are faced with incomplete datasets. In the literature, missing data imputation techniques have been mostly concerned with filling missing values. However, the existence of missing values is…

Machine Learning · Computer Science 2020-09-07 Mohammad Kachuee, Kimmo Karkkainen, Orpaz Goldstein, Sajad Darabi, Majid Sarrafzadeh

FragmGAN: Generative Adversarial Nets for Fragmentary Data Imputation and Prediction

Modern scientific research and applications very often encounter "fragmentary data" which brings big challenges to imputation and prediction. By leveraging the structure of response patterns, we propose a unified and flexible framework…

Machine Learning · Computer Science 2022-03-10 Fang Fang, Shenliao Bao

What Is a Good Imputation Under MAR Missingness?

Missing values pose a persistent challenge in modern data science. Consequently, there is an ever-growing number of publications introducing new imputation methods in various fields. The present paper attempts to take a step back and…

Statistics Theory · Mathematics 2026-01-21 Jeffrey Näf, Erwan Scornet, Julie Josse

A Unified Framework for Inference with General Missingness Patterns and Machine Learning Imputation

Pre-trained machine learning (ML) predictions have been increasingly used to complement incomplete data to enable downstream scientific inquiries, but their naive integration risks biased inferences. Recently, multiple methods have been…

Methodology · Statistics 2025-11-12 Xingran Chen, Tyler McCormick, Bhramar Mukherjee, Zhenke Wu

Random features models: a way to study the success of naive imputation

Constant (naive) imputation is still widely used in practice as this is a first easy-to-use technique to deal with missing data. Yet, this simple method could be expected to induce a large bias for prediction purposes, as the imputed input…

Statistics Theory · Mathematics 2024-02-07 Alexis Ayme, Claire Boyer, Aymeric Dieuleveut, Erwan Scornet

FCMI: Feature Correlation based Missing Data Imputation

Processed data are insightful, and crude data are obtuse. A serious threat to data reliability is missing values. Such data leads to inaccurate analysis and wrong predictions. We propose an efficient technique to impute the missing value in…

Machine Learning · Computer Science 2021-07-02 Prateek Mishra, Kumar Divya Mani, Prashant Johri, Dikhsa Arya

Identifiable Generative Models for Missing Not at Random Data Imputation

Real-world datasets often have missing values associated with complex generative processes, where the cause of the missingness may not be fully observed. This is known as missing not at random (MNAR) data. However, many imputation methods…

Machine Learning · Computer Science 2021-10-29 Chao Ma, Cheng Zhang

Improving Missing Data Imputation with Deep Generative Models

Datasets with missing values are very common in industry applications, and they can have a negative impact on machine learning models. Recent studies introduced solutions to the problem of imputing missing values based on deep generative…

Machine Learning · Computer Science 2019-02-28 Ramiro D. Camino, Christian A. Hammerschmidt, Radu State

CFMI: Flow Matching for Missing Data Imputation

We introduce conditional flow matching for imputation (CFMI), a new general-purpose method to impute missing data. The method combines continuous normalising flows, flow-matching, and shared conditional modelling to deal with…

Machine Learning · Computer Science 2025-06-12 Vaidotas Simkus, Michael U. Gutmann

HyperImpute: Generalized Iterative Imputation with Automatic Model Selection

Consider the problem of imputing missing values in a dataset. On the one hand, conventional approaches using iterative imputation benefit from the simplicity and customizability of learning conditional distributions directly, but suffer…

Machine Learning · Statistics 2022-06-17 Daniel Jarrett, Bogdan Cebere, Tennison Liu, Alicia Curth, Mihaela van der Schaar

Probabilistic Imputation for Time-series Classification with Missing Data

Multivariate time series data for real-world applications typically contain a significant amount of missing values. The dominant approach for classification with such missing values is to impute them heuristically with specific values…

Machine Learning · Computer Science 2023-08-15 SeungHyun Kim, Hyunsu Kim, EungGu Yun, Hwangrae Lee, Jaehun Lee, Juho Lee

Rethinking GNNs and Missing Features: Challenges, Evaluation and a Robust Solution

Handling missing node features is a key challenge for deploying Graph Neural Networks (GNNs) in real-world domains such as healthcare and sensor networks. Existing studies mostly address relatively benign scenarios, namely benchmark…

Machine Learning · Computer Science 2026-05-19 Francesco Ferrini, Veronica Lachi, Antonio Longa, Bruno Lepri, Matono Akiyoshi, Andrea Passerini, Xin Liu, Manfred Jaeger

Estimating conditional density of missing values using deep Gaussian mixture model

We consider the problem of estimating the conditional probability distribution of missing values given the observed ones. We propose an approach, which combines the flexibility of deep neural networks with the simplicity of Gaussian mixture…

Machine Learning · Computer Science 2020-11-20 Marcin Przewięźlikowski, Marek Śmieja, Łukasz Struski

Missing Value Imputation Based on Deep Generative Models

Missing values are widespread in real-world datasets and hinder advanced data analytics. Properly filling these missing values is crucial but challenging, especially when the missing rate is high. Many approaches…

Machine Learning · Computer Science 2018-08-07 Hongbao Zhang, Pengtao Xie, Eric Xing

Recursive Equations For Imputation Of Missing Not At Random Data With Sparse Pattern Support

A common approach for handling missing values in data analysis pipelines is multiple imputation via software packages such as MICE (Van Buuren and Groothuis-Oudshoorn, 2011) and Amelia (Honaker et al., 2011). These packages typically assume…

Methodology · Statistics 2025-07-23 Trung Phung, Kyle Reese, Ilya Shpitser, Rohit Bhattacharya