Related papers: Self-Supervision Improves Diffusion Models for Tab…

MissDDIM: Deterministic and Efficient Conditional Diffusion for Tabular Data Imputation

Diffusion models have recently emerged as powerful tools for missing data imputation by modeling the joint distribution of observed and unobserved variables. However, existing methods, typically based on stochastic denoising diffusion…

Artificial Intelligence · Computer Science 2025-08-06 Youran Zhou, Mohamed Reda Bouadjenek, Sunil Aryal

Diffusion Models for Tabular Data Imputation and Synthetic Data Generation

Data imputation and data generation have important applications for many domains, like healthcare and finance, where incomplete or missing data can hinder accurate analysis and decision-making. Diffusion models have emerged as powerful…

Machine Learning · Computer Science 2025-06-10 Mario Villaizán-Vallelado, Matteo Salvatori, Carlos Segura, Ioannis Arapakis

DiffImpute: Tabular Data Imputation With Denoising Diffusion Probabilistic Model

Tabular data plays a crucial role in various domains but often suffers from missing values, thereby curtailing its potential utility. Traditional imputation techniques frequently yield suboptimal results and impose substantial computational…

Machine Learning · Computer Science 2024-03-22 Yizhu Wen, Kai Yi, Jing Ke, Yiqing Shen

TabDiff: a Mixed-type Diffusion Model for Tabular Data Generation

Synthesizing high-quality tabular data is an important topic in many data science tasks, ranging from dataset augmentation to privacy protection. However, developing expressive generative models for tabular data is challenging due to its…

Machine Learning · Computer Science 2025-02-18 Juntong Shi, Minkai Xu, Harper Hua, Hengrui Zhang, Stefano Ermon, Jure Leskovec

TabDDPM: Modelling Tabular Data with Diffusion Models

Denoising diffusion probabilistic models are currently becoming the leading paradigm of generative modeling for many important data modalities. Being the most prevalent in the computer vision community, diffusion models have also recently…

Machine Learning · Computer Science 2024-10-08 Akim Kotelnikov, Dmitry Baranchuk, Ivan Rubachev, Artem Babenko

ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection

Anomaly detection in multivariate time series data is of paramount importance for ensuring the efficient operation of large-scale systems across diverse domains. However, accurately detecting anomalies in such data poses significant…

Machine Learning · Computer Science 2023-11-15 Yuhang Chen, Chaoyun Zhang, Minghua Ma, Yudong Liu, Ruomeng Ding, Bowen Li, Shilin He, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

Extending Tabular Denoising Diffusion Probabilistic Models for Time-Series Data Generation

Diffusion models are increasingly being utilised to create synthetic tabular and time series data for privacy-preserving augmentation. Tabular Denoising Diffusion Probabilistic Models (TabDDPM) generate high-quality synthetic data from…

Machine Learning · Computer Science 2026-04-08 Umang Dobhal, Christina Garcia, Sozo Inoue

SlimDiffSR: Toward Lightweight and Efficient Remote Sensing Image Super-Resolution via Diffusion Model Distillation

Diffusion models have recently achieved remarkable performance in image super-resolution (SR), but their high computational cost limits practical deployment in remote sensing applications. To address this issue, we propose SlimDiffSR, a…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Ce Wang, Zhenyu Hu, Wanjie Sun

Diffusion models for missing value imputation in tabular data

Missing value imputation in machine learning is the task of estimating the missing values in the dataset accurately using available information. In this task, several deep generative modeling methods have been proposed and demonstrated…

Machine Learning · Computer Science 2023-03-14 Shuhan Zheng, Nontawat Charoenphakdee

TabADM: Unsupervised Tabular Anomaly Detection with Diffusion Models

Tables are an abundant form of data with use cases across all scientific fields. Real-world datasets often contain anomalous samples that can negatively affect downstream analysis. In this work, we only assume access to contaminated data…

Machine Learning · Computer Science 2023-07-25 Guy Zamberg, Moshe Salhov, Ofir Lindenbaum, Amir Averbuch

SimDiff: Simpler Yet Better Diffusion Model for Time Series Point Forecasting

Diffusion models have recently shown promise in time series forecasting, particularly for probabilistic predictions. However, they often fail to achieve state-of-the-art point estimation performance compared to regression-based methods.…

Artificial Intelligence · Computer Science 2025-11-25 Hang Ding, Xue Wang, Tian Zhou, Tao Yao

Diffusion-Based Data Augmentation for Image Recognition: A Systematic Analysis and Evaluation

Diffusion-based data augmentation (DiffDA) has emerged as a promising approach to improving classification performance under data scarcity. However, existing works vary significantly in task configurations, model choices, and experimental…

Computer Vision and Pattern Recognition · Computer Science 2026-03-10 Zekun Li, Yinghuan Shi, Yang Gao, Dong Xu

Diffuse to Detect: Bi-Level Sample Rebalancing with Pseudo-Label Diffusion for Point-Supervised Infrared Small-Target Detection

Point supervision has become a scalable solution to address dense annotation for infrared small target detection, but its performance is limited by two coupled bottlenecks: unstable pseudo-label evolution in cluttered, low-contrast infrared…

Computer Vision and Pattern Recognition · Computer Science 2026-05-21 Zhu Liu, Yuanhang Yao, Ping Qian, Zihang Chen, Risheng Liu

SaSDim: Self-Adaptive Noise Scaling Diffusion Model for Spatial Time Series Imputation

Spatial time series imputation is critically important to many real applications such as intelligent transportation and air quality monitoring. Although recent transformer and diffusion model based approaches have achieved significant…

Machine Learning · Computer Science 2023-09-06 Shunyang Zhang, Senzhang Wang, Xianzhen Tan, Ruochen Liu, Jian Zhang, Jianxin Wang

MissHDD: Hybrid Deterministic Diffusion for Hetrogeneous Incomplete Data Imputation

Incomplete data are common in real-world tabular applications, where numerical, categorical, and discrete attributes coexist within a single dataset. This heterogeneous structure presents significant challenges for existing diffusion-based…

Machine Learning · Computer Science 2025-11-19 Youran Zhou, Mohamed Reda Bouadjenek, Sunil Aryal

Addressing Multilabel Imbalance with an Efficiency-Focused Approach Using Diffusion Model-Generated Synthetic Samples

Predictive models trained on imbalanced data tend to produce biased results. This problem is exacerbated when there is not just one output label, but a set of them. This is the case for multilabel learning (MLL) algorithms used to classify…

Machine Learning · Computer Science 2025-01-22 Francisco Charte, Miguel Ángel Dávila, María Dolores Pérez-Godoy, María José del Jesus

Self-Supervised Diffusion Model for 3-D Seismic Data Reconstruction

Seismic data reconstruction is an effective tool for compensating nonuniform and incomplete seismic geometry. Compared with methods for 2D seismic data, 3D reconstruction methods could consider more spatial structure correlation in seismic…

Geophysics · Physics 2024-06-21 Xinyang Wang, Qianyu Ge, Xintong Dong, Shiqi Dong, Tie Zhong

Self-Guided Diffusion Model for Accelerating Computational Fluid Dynamics

Machine learning methods, such as diffusion models, are widely explored as a promising way to accelerate high-fidelity fluid dynamics computation via a super-resolution process from faster-to-compute low-fidelity input. However, existing…

Computational Engineering, Finance, and Science · Computer Science 2025-12-24 Ruoyan Li, Zijie Huang, Haixin Wang, Guancheng Wan, Yizhou Sun, Wei Wang

FinDiff: Diffusion Models for Financial Tabular Data Generation

The sharing of microdata, such as fund holdings and derivative instruments, by regulatory institutions presents a unique challenge due to strict data confidentiality and privacy regulations. These challenges often hinder the ability of both…

Machine Learning · Computer Science 2023-09-06 Timur Sattarov, Marco Schreyer, Damian Borth

Diffusion and Flow Matching Models for Tabular Data: A Survey

Deep generative models have made rapid progress in image, text, audio, and video generation, and are increasingly being applied to structured records. For tabular data, however, generative modeling remains difficult: a dataset may contain…

Machine Learning · Computer Science 2026-05-25 Zhong Li, Qi Huang, Lincen Yang, Jiayang Shi, Zhao Yang, Niki van Stein, Thomas Bäck, Matthijs van Leeuwen