Related papers: DiffPuter: Empowering Diffusion Models for Missing…

DiffEM: Learning from Corrupted Data with Diffusion Models via Expectation Maximization

Diffusion models have emerged as powerful generative priors for high-dimensional inverse problems, yet learning them when only corrupted or noisy observations are available remains challenging. In this work, we propose a new method for…

Machine Learning · Computer Science 2025-12-23 Danial Hosseintabar, Fan Chen, Giannis Daras, Antonio Torralba, Constantinos Daskalakis

DiffImpute: Tabular Data Imputation With Denoising Diffusion Probabilistic Model

Tabular data plays a crucial role in various domains but often suffers from missing values, thereby curtailing its potential utility. Traditional imputation techniques frequently yield suboptimal results and impose substantial computational…

Machine Learning · Computer Science 2024-03-22 Yizhu Wen, Kai Yi, Jing Ke, Yiqing Shen

Diffusion Models for Tabular Data Imputation and Synthetic Data Generation

Data imputation and data generation have important applications for many domains, like healthcare and finance, where incomplete or missing data can hinder accurate analysis and decision-making. Diffusion models have emerged as powerful…

Machine Learning · Computer Science 2025-06-10 Mario Villaizán-Vallelado, Matteo Salvatori, Carlos Segura, Ioannis Arapakis

Diffusion Transformers for Imputation: Statistical Efficiency and Uncertainty Quantification

Imputation methods play a critical role in enhancing the quality of practical time-series data, which often suffer from pervasive missing values. Recently, diffusion-based generative imputation methods have demonstrated remarkable success…

Machine Learning · Computer Science 2025-10-03 Zeqi Ye, Minshuo Chen

Latent Diffusion for Missing Data

Diffusion models have emerged as powerful generative approaches for missing-data imputation, yet most existing methods operate directly in data space and degrade when training data are heavily incomplete. We investigate whether shifting…

Machine Learning · Computer Science 2026-05-28 Alberte Heering Estad, Ignacio Peis, Jes Frellsen

Improving Missing Data Imputation with Deep Generative Models

Datasets with missing values are very common in industry applications, and they can have a negative impact on machine learning models. Recent studies introduced solutions to the problem of imputing missing values based on deep generative…

Machine Learning · Computer Science 2019-02-28 Ramiro D. Camino, Christian A. Hammerschmidt, Radu State

MissDDIM: Deterministic and Efficient Conditional Diffusion for Tabular Data Imputation

Diffusion models have recently emerged as powerful tools for missing data imputation by modeling the joint distribution of observed and unobserved variables. However, existing methods, typically based on stochastic denoising diffusion…

Artificial Intelligence · Computer Science 2025-08-06 Youran Zhou, Mohamed Reda Bouadjenek, Sunil Aryal

Diffusion models for missing value imputation in tabular data

Missing value imputation in machine learning is the task of estimating the missing values in the dataset accurately using available information. In this task, several deep generative modeling methods have been proposed and demonstrated…

Machine Learning · Computer Science 2023-03-14 Shuhan Zheng, Nontawat Charoenphakdee

An Expectation-Maximization Algorithm for Training Clean Diffusion Models from Corrupted Observations

Diffusion models excel in solving imaging inverse problems due to their ability to model complex image priors. However, their reliance on large, clean datasets for training limits their practical use where clean data is scarce. In this…

Computer Vision and Pattern Recognition · Computer Science 2024-07-02 Weimin Bai, Yifei Wang, Wenzheng Chen, He Sun

Conditional expectation with regularization for missing data imputation

Missing data frequently occurs in datasets across various domains, such as medicine, sports, and finance. In many cases, to enable proper and reliable analyses of such data, the missing values are often imputed, and it is necessary that the…

Machine Learning · Statistics 2023-09-12 Mai Anh Vu, Thu Nguyen, Tu T. Do, Nhan Phan, Nitesh V. Chawla, Pål Halvorsen, Michael A. Riegler, Binh T. Nguyen

Generative Modeling and Data Augmentation for Power System Production Simulation

As a key component of power system production simulation, load forecasting is critical for the stable operation of power systems. Machine learning methods prevail in this field. However, the limited training data can be a challenge. This…

Systems and Control · Electrical Eng. & Systems 2024-12-18 Linna Xu, Yongli Zhu

DPER: Efficient Parameter Estimation for Randomly Missing Data

The missing data problem has been broadly studied in the last few decades and has various applications in different areas such as statistics or bioinformatics. Even though many methods have been developed to tackle this challenge, most of…

Machine Learning · Statistics 2021-06-10 Thu Nguyen, Khoi Minh Nguyen-Duy, Duy Ho Minh Nguyen, Binh T. Nguyen, Bruce Alan Wade

Generative Imputation and Stochastic Prediction

In many machine learning applications, we are faced with incomplete datasets. In the literature, missing data imputation techniques have been mostly concerned with filling missing values. However, the existence of missing values is…

Machine Learning · Computer Science 2020-09-07 Mohammad Kachuee, Kimmo Karkkainen, Orpaz Goldstein, Sajad Darabi, Majid Sarrafzadeh

DiffusionRet: Generative Text-Video Retrieval with Diffusion Model

Existing text-video retrieval solutions are, in essence, discriminant models focused on maximizing the conditional likelihood, i.e., p(candidates|query). While straightforward, this de facto paradigm overlooks the underlying data…

Computer Vision and Pattern Recognition · Computer Science 2023-08-22 Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Xiangyang Ji, Chang Liu, Li Yuan, Jie Chen

Towards a methodology for addressing missingness in datasets, with an application to demographic health datasets

Missing data is a common concern in health datasets, and its impact on good decision-making processes is well documented. Our study's contribution is a methodology for tackling missing data problems using a combination of synthetic dataset…

Machine Learning · Computer Science 2022-11-08 Gift Khangamwa, Terence L. van Zyl, Clint J. van Alten

Proposition of a Theoretical Model for Missing Data Imputation using Deep Learning and Evolutionary Algorithms

In the last couple of decades, there have been major advancements in the domain of missing data imputation. The techniques in the domain include, amongst others: Expectation Maximization, Neural Networks with Evolutionary Algorithms or…

Neural and Evolutionary Computing · Computer Science 2015-12-07 Collins Leke, Tshilidzi Marwala, Satyakama Paul

HyperImpute: Generalized Iterative Imputation with Automatic Model Selection

Consider the problem of imputing missing values in a dataset. On the one hand, conventional approaches using iterative imputation benefit from the simplicity and customizability of learning conditional distributions directly, but suffer…

Machine Learning · Statistics 2022-06-17 Daniel Jarrett, Bogdan Cebere, Tennison Liu, Alicia Curth, Mihaela van der Schaar

The Missing U for Efficient Diffusion Models

Diffusion Probabilistic Models stand as a critical tool in generative modelling, enabling the generation of complex data distributions. This family of generative models yields record-breaking performance in tasks such as image synthesis,…

Machine Learning · Computer Science 2024-04-08 Sergio Calvo-Ordonez, Chun-Wun Cheng, Jiahao Huang, Lipei Zhang, Guang Yang, Carola-Bibiane Schonlieb, Angelica I Aviles-Rivero

Impugan: Learning Conditional Generative Models for Robust Data Imputation

Incomplete data are common in real-world applications. Sensors fail, records are inconsistent, and datasets collected from different sources often differ in scale, sampling rate, and quality. These differences create missing values that…

Machine Learning · Computer Science 2025-12-08 Zalish Mahmud, Anantaa Kotal, Aritran Piplai

MissDiff: Training Diffusion Models on Tabular Data with Missing Values

The diffusion model has shown remarkable performance in modeling data distributions and synthesizing data. However, the vanilla diffusion model requires complete or fully observed data for training. Incomplete data is a common issue in…

Machine Learning · Computer Science 2023-07-04 Yidong Ouyang, Liyan Xie, Chongxuan Li, Guang Cheng