Related papers: DiffImpute: Tabular Data Imputation With Denoising…

MissDDIM: Deterministic and Efficient Conditional Diffusion for Tabular Data Imputation

Diffusion models have recently emerged as powerful tools for missing data imputation by modeling the joint distribution of observed and unobserved variables. However, existing methods, typically based on stochastic denoising diffusion…

Artificial Intelligence · Computer Science 2025-08-06 Youran Zhou , Mohamed Reda Bouadjenek , Sunil Aryal

DiffPuter: Empowering Diffusion Models for Missing Data Imputation

Generative models play an important role in missing data imputation in that they aim to learn the joint distribution of full data. However, applying advanced deep generative models (such as Diffusion models) to missing data imputation is…

Machine Learning · Computer Science 2025-05-27 Hengrui Zhang , Liancheng Fang , Qitian Wu , Philip S. Yu

MissHDD: Hybrid Deterministic Diffusion for Heterogeneous Incomplete Data Imputation

Incomplete data are common in real-world tabular applications, where numerical, categorical, and discrete attributes coexist within a single dataset. This heterogeneous structure presents significant challenges for existing diffusion-based…

Machine Learning · Computer Science 2025-11-19 Youran Zhou , Mohamed Reda Bouadjenek , Sunil Aryal

TabImpute: Universal Zero-Shot Imputation for Tabular Data

Missing data is a widespread problem in tabular settings. Existing solutions range from simple averaging to complex generative adversarial networks, but due to each method's large variance in performance across real-world domains and…

Machine Learning · Computer Science 2026-02-18 Jacob Feitelberg , Dwaipayan Saha , Kyuseong Choi , Zaid Ahmad , Anish Agarwal , Raaz Dwivedi

TabDDPM: Modelling Tabular Data with Diffusion Models

Denoising diffusion probabilistic models are currently becoming the leading paradigm of generative modeling for many important data modalities. Being the most prevalent in the computer vision community, diffusion models have also recently…

Machine Learning · Computer Science 2024-10-08 Akim Kotelnikov , Dmitry Baranchuk , Ivan Rubachev , Artem Babenko

Self-Supervision Improves Diffusion Models for Tabular Data Imputation

The ubiquity of missing data has sparked considerable attention and focus on tabular data imputation methods. Diffusion models, recognized as the cutting-edge technique for data generation, demonstrate significant potential in tabular data…

Machine Learning · Computer Science 2024-07-26 Yixin Liu , Thalaiyasingam Ajanthan , Hisham Husain , Vu Nguyen

FedTabDiff: Federated Learning of Diffusion Probabilistic Models for Synthetic Mixed-Type Tabular Data Generation

Realistic synthetic tabular data generation encounters significant challenges in preserving privacy, especially when dealing with sensitive information in domains like finance and healthcare. In this paper, we introduce Federated…

Machine Learning · Computer Science 2024-01-15 Timur Sattarov , Marco Schreyer , Damian Borth

Extending Tabular Denoising Diffusion Probabilistic Models for Time-Series Data Generation

Diffusion models are increasingly being utilised to create synthetic tabular and time series data for privacy-preserving augmentation. Tabular Denoising Diffusion Probabilistic Models (TabDDPM) generate high-quality synthetic data from…

Machine Learning · Computer Science 2026-04-08 Umang Dobhal , Christina Garcia , Sozo Inoue

TabDiff: a Mixed-type Diffusion Model for Tabular Data Generation

Synthesizing high-quality tabular data is an important topic in many data science tasks, ranging from dataset augmentation to privacy protection. However, developing expressive generative models for tabular data is challenging due to its…

Machine Learning · Computer Science 2025-02-18 Juntong Shi , Minkai Xu , Harper Hua , Hengrui Zhang , Stefano Ermon , Jure Leskovec

Diffusion Models for Tabular Data Imputation and Synthetic Data Generation

Data imputation and data generation have important applications for many domains, like healthcare and finance, where incomplete or missing data can hinder accurate analysis and decision-making. Diffusion models have emerged as powerful…

Machine Learning · Computer Science 2025-06-10 Mario Villaizán-Vallelado , Matteo Salvatori , Carlos Segura , Ioannis Arapakis

Diffusion models for missing value imputation in tabular data

Missing value imputation in machine learning is the task of estimating the missing values in the dataset accurately using available information. In this task, several deep generative modeling methods have been proposed and demonstrated…

Machine Learning · Computer Science 2023-03-14 Shuhan Zheng , Nontawat Charoenphakdee

Not Another Imputation Method: A Transformer-based Model for Missing Values in Tabular Datasets

Handling missing values in tabular datasets presents a significant challenge in training and testing artificial intelligence models, an issue usually addressed using imputation techniques. Here we introduce "Not Another Imputation Method"…

Machine Learning · Computer Science 2026-03-13 Camillo Maria Caruso , Paolo Soda , Valerio Guarrasi

TabINR: An Implicit Neural Representation Framework for Tabular Data Imputation

Tabular data builds the basis for a wide range of applications, yet real-world datasets are frequently incomplete due to collection errors, privacy restrictions, or sensor failures. As missing values degrade the performance or hinder the…

Machine Learning · Computer Science 2025-10-02 Vincent Ochs , Florentin Bieder , Sidaty el Hadramy , Paul Friedrich , Stephanie Taha-Mehlitz , Anas Taha , Philippe C. Cattin

Diffusion-Driven Synthetic Tabular Data Generation for Enhanced DoS/DDoS Attack Classification

Class imbalance refers to a situation where certain classes in a dataset have significantly fewer samples than others, leading to biased model performance. Class imbalance in network intrusion detection using Tabular Denoising Diffusion…

Cryptography and Security · Computer Science 2026-02-02 Aravind B , Anirud R. S. , Sai Surya Teja N , Bala Subrahmanya Sriranga Navaneeth A , Karthika R , Mohankumar N

DeepIFSAC: Deep Imputation of Missing Values Using Feature and Sample Attention within Contrastive Framework

Missing values of varying patterns and rates in real-world tabular data pose a significant challenge in developing reliable data-driven models. The most commonly used statistical and machine learning methods for missing value imputation may…

Machine Learning · Computer Science 2025-03-26 Ibna Kowsar , Shourav B. Rabbani , Yina Hou , Manar D. Samad

Bring the Power of Diffusion Model to Defect Detection

Due to the high complexity and technical requirements of industrial production processes, surface defects will inevitably appear, which seriously affects the quality of products. Although existing lightweight detection networks are highly…

Computer Vision and Pattern Recognition · Computer Science 2024-08-27 Xuyi Yu

Rethinking the Diffusion Models for Numerical Tabular Data Imputation from the Perspective of Wasserstein Gradient Flow

Diffusion models (DMs) have gained attention in Missing Data Imputation (MDI), but there remain two long-neglected issues to be addressed: (1) Inaccurate Imputation, which arises from inherently sample-diversification-pursuing generative…

Machine Learning · Computer Science 2024-06-25 Zhichao Chen , Haoxuan Li , Fangyikang Wang , Odin Zhang , Hu Xu , Xiaoyu Jiang , Zhihuan Song , Eric H. Wang

Conditional expectation with regularization for missing data imputation

Missing data frequently occurs in datasets across various domains, such as medicine, sports, and finance. In many cases, to enable proper and reliable analyses of such data, the missing values are often imputed, and it is necessary that the…

Machine Learning · Statistics 2023-09-12 Mai Anh Vu , Thu Nguyen , Tu T. Do , Nhan Phan , Nitesh V. Chawla , Pål Halvorsen , Michael A. Riegler , Binh T. Nguyen

Impute-MACFM: Imputation based on Mask-Aware Flow Matching

Tabular data are central to many applications, especially longitudinal data in healthcare, where missing values are common, undermining model fidelity and reliability. Prior imputation methods either impose restrictive assumptions or…

Machine Learning · Computer Science 2025-09-30 Dengyi Liu , Honggang Wang , Hua Fang

I-Diff: Structural Regularization for High-Fidelity Diffusion Models

Denoising Diffusion Probabilistic Models (DDPMs) have significantly advanced generative AI, achieving impressive results in high-quality image and data generation. However, enhancing fidelity without compromising semantic content remains a…

Machine Learning · Computer Science 2025-12-17 Shakthi Perera , Dilum Fernando , H. L. P. Malshan , H. M. P. S. Madushan , Roshan Godaliyadda , M. P. B. Ekanayake , Dhananjaya Jayasundara , Roshan Ragel