
Related papers: Self-Supervision Improves Diffusion Models for Tab…

200 papers

Diffusion models have recently emerged as powerful tools for missing data imputation by modeling the joint distribution of observed and unobserved variables. However, existing methods, typically based on stochastic denoising diffusion…

Artificial Intelligence · Computer Science 2025-08-06 Youran Zhou, Mohamed Reda Bouadjenek, Sunil Aryal

Data imputation and data generation have important applications for many domains, like healthcare and finance, where incomplete or missing data can hinder accurate analysis and decision-making. Diffusion models have emerged as powerful…

Machine Learning · Computer Science 2025-06-10 Mario Villaizán-Vallelado, Matteo Salvatori, Carlos Segura, Ioannis Arapakis

Tabular data plays a crucial role in various domains but often suffers from missing values, thereby curtailing its potential utility. Traditional imputation techniques frequently yield suboptimal results and impose substantial computational…

Machine Learning · Computer Science 2024-03-22 Yizhu Wen, Kai Yi, Jing Ke, Yiqing Shen

Synthesizing high-quality tabular data is an important topic in many data science tasks, ranging from dataset augmentation to privacy protection. However, developing expressive generative models for tabular data is challenging due to its…

Machine Learning · Computer Science 2025-02-18 Juntong Shi, Minkai Xu, Harper Hua, Hengrui Zhang, Stefano Ermon, Jure Leskovec

Denoising diffusion probabilistic models are currently becoming the leading paradigm of generative modeling for many important data modalities. Being the most prevalent in the computer vision community, diffusion models have also recently…

Machine Learning · Computer Science 2024-10-08 Akim Kotelnikov, Dmitry Baranchuk, Ivan Rubachev, Artem Babenko

Anomaly detection in multivariate time series data is of paramount importance for ensuring the efficient operation of large-scale systems across diverse domains. However, accurately detecting anomalies in such data poses significant…

Machine Learning · Computer Science 2023-11-15 Yuhang Chen, Chaoyun Zhang, Minghua Ma, Yudong Liu, Ruomeng Ding, Bowen Li, Shilin He, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

Diffusion models are increasingly being utilised to create synthetic tabular and time series data for privacy-preserving augmentation. Tabular Denoising Diffusion Probabilistic Models (TabDDPM) generate high-quality synthetic data from…

Machine Learning · Computer Science 2026-04-08 Umang Dobhal, Christina Garcia, Sozo Inoue

Diffusion models have recently achieved remarkable performance in image super-resolution (SR), but their high computational cost limits practical deployment in remote sensing applications. To address this issue, we propose SlimDiffSR, a…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Ce Wang, Zhenyu Hu, Wanjie Sun

Missing value imputation in machine learning is the task of estimating the missing values in the dataset accurately using available information. In this task, several deep generative modeling methods have been proposed and demonstrated…

Machine Learning · Computer Science 2023-03-14 Shuhan Zheng, Nontawat Charoenphakdee

Tables are an abundant form of data with use cases across all scientific fields. Real-world datasets often contain anomalous samples that can negatively affect downstream analysis. In this work, we only assume access to contaminated data…

Machine Learning · Computer Science 2023-07-25 Guy Zamberg, Moshe Salhov, Ofir Lindenbaum, Amir Averbuch

Diffusion models have recently shown promise in time series forecasting, particularly for probabilistic predictions. However, they often fail to achieve state-of-the-art point estimation performance compared to regression-based methods.…

Artificial Intelligence · Computer Science 2025-11-25 Hang Ding, Xue Wang, Tian Zhou, Tao Yao

Diffusion-based data augmentation (DiffDA) has emerged as a promising approach to improving classification performance under data scarcity. However, existing works vary significantly in task configurations, model choices, and experimental…

Computer Vision and Pattern Recognition · Computer Science 2026-03-10 Zekun Li, Yinghuan Shi, Yang Gao, Dong Xu

Point supervision has become a scalable solution to address dense annotation for infrared small target detection, but its performance is limited by two coupled bottlenecks: unstable pseudo-label evolution in cluttered, low-contrast infrared…

Computer Vision and Pattern Recognition · Computer Science 2026-05-21 Zhu Liu, Yuanhang Yao, Ping Qian, Zihang Chen, Risheng Liu

Spatial time series imputation is critically important to many real applications such as intelligent transportation and air quality monitoring. Although recent transformer and diffusion model based approaches have achieved significant…

Machine Learning · Computer Science 2023-09-06 Shunyang Zhang, Senzhang Wang, Xianzhen Tan, Ruochen Liu, Jian Zhang, Jianxin Wang

Incomplete data are common in real-world tabular applications, where numerical, categorical, and discrete attributes coexist within a single dataset. This heterogeneous structure presents significant challenges for existing diffusion-based…

Machine Learning · Computer Science 2025-11-19 Youran Zhou, Mohamed Reda Bouadjenek, Sunil Aryal

Predictive models trained on imbalanced data tend to produce biased results. This problem is exacerbated when there is not just one output label, but a set of them. This is the case for multilabel learning (MLL) algorithms used to classify…

Machine Learning · Computer Science 2025-01-22 Francisco Charte, Miguel Ángel Dávila, María Dolores Pérez-Godoy, María José del Jesus

Seismic data reconstruction is an effective tool for compensating nonuniform and incomplete seismic geometry. Compared with methods for 2D seismic data, 3D reconstruction methods could consider more spatial structure correlation in seismic…

Geophysics · Physics 2024-06-21 Xinyang Wang, Qianyu Ge, Xintong Dong, Shiqi Dong, Tie Zhong

Machine learning methods, such as diffusion models, are widely explored as a promising way to accelerate high-fidelity fluid dynamics computation via a super-resolution process from faster-to-compute low-fidelity input. However, existing…

Computational Engineering, Finance, and Science · Computer Science 2025-12-24 Ruoyan Li, Zijie Huang, Haixin Wang, Guancheng Wan, Yizhou Sun, Wei Wang

The sharing of microdata, such as fund holdings and derivative instruments, by regulatory institutions presents a unique challenge due to strict data confidentiality and privacy regulations. These challenges often hinder the ability of both…

Machine Learning · Computer Science 2023-09-06 Timur Sattarov, Marco Schreyer, Damian Borth

Deep generative models have made rapid progress in image, text, audio, and video generation, and are increasingly being applied to structured records. For tabular data, however, generative modeling remains difficult: a dataset may contain…

Machine Learning · Computer Science 2026-05-25 Zhong Li, Qi Huang, Lincen Yang, Jiayang Shi, Zhao Yang, Niki van Stein, Thomas Bäck, Matthijs van Leeuwen