Related papers: DiffusionBERT: Improving Generative Masked Languag…

A Cheaper and Better Diffusion Language Model with Soft-Masked Noise

Diffusion models that are based on iterative denoising have been recently proposed and leveraged in various generation tasks like image generation. Whereas, as a way inherently built for continuous data, existing diffusion models still have…

Computation and Language · Computer Science 2023-04-11 Jiaao Chen, Aston Zhang, Mu Li, Alex Smola, Diyi Yang

Investigating the Design Space of Diffusion Models for Speech Enhancement

Diffusion models are a new class of generative models that have shown outstanding performance in image generation literature. As a consequence, studies have attempted to apply diffusion models to other tasks, such as speech enhancement. A…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-10 Philippe Gonzalez, Zheng-Hua Tan, Jan Østergaard, Jesper Jensen, Tommy Sonne Alstrøm, Tobias May

Diffusion Buffer for Online Generative Speech Enhancement

Online Speech Enhancement was mainly reserved for predictive models. A key advantage of these models is that for an incoming signal frame from a stream of data, the model is called only once for enhancement. In contrast, generative Speech…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-22 Bunlong Lay, Rostislav Makarov, Simon Welker, Maris Hillemann, Timo Gerkmann

Diffusion Models in Vision: A Survey

Denoising diffusion models represent a recent emerging topic in computer vision, demonstrating remarkable results in the area of generative modeling. A diffusion model is a deep generative model that is based on two stages, a forward…

Computer Vision and Pattern Recognition · Computer Science 2025-01-17 Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, Mubarak Shah

On the Application of Diffusion Models for Simultaneous Denoising and Dereverberation

Diffusion models have been shown to achieve natural-sounding enhancement of speech degraded by noise or reverberation. However, their simultaneous denoising and dereverberation capability has so far not been studied much, although this is…

Audio and Speech Processing · Electrical Eng. & Systems 2025-08-27 Adrian Meise, Tobias Cord-Landwehr, Reinhold Haeb-Umbach

Diff-Prompt: Diffusion-Driven Prompt Generator with Mask Supervision

Prompt learning has demonstrated promising results in fine-tuning pre-trained multimodal models. However, the performance improvement is limited when applied to more complex and fine-grained tasks. The reason is that most existing methods…

Computer Vision and Pattern Recognition · Computer Science 2025-05-01 Weicai Yan, Wang Lin, Zirun Guo, Ye Wang, Fangming Feng, Xiaoda Yang, Zehan Wang, Tao Jin

Discrete Diffusion Models for Language Generation

Diffusion models have emerged as a powerful class of generative models, achieving state-of-the-art results in continuous data domains such as image and video generation. Their core mechanism involves a forward diffusion process that…

Computation and Language · Computer Science 2025-07-10 Ashen Weligalle

Latent Diffusion for Language Generation

Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have…

Computation and Language · Computer Science 2023-11-08 Justin Lovelace, Varsha Kishore, Chao Wan, Eliot Shekhtman, Kilian Q. Weinberger

DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation

While Diffusion Generative Models have achieved great success on image generation tasks, how to efficiently and effectively incorporate them into speech generation especially translation tasks remains a non-trivial problem. Specifically,…

Computation and Language · Computer Science 2023-10-27 Yongxin Zhu, Zhujin Gao, Xinyuan Zhou, Zhongyi Ye, Linli Xu

Speech Enhancement and Dereverberation with Diffusion-based Generative Models

In this work, we build upon our previous publication and use diffusion-based generative models for speech enhancement. We present a detailed overview of the diffusion process that is based on a stochastic differential equation and delve…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-14 Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Timo Gerkmann

Diffusion Boosted Trees

Combining the merits of both denoising diffusion probabilistic models and gradient boosting, the diffusion boosting paradigm is introduced for tackling supervised learning problems. We develop Diffusion Boosted Trees (DBT), which can be…

Machine Learning · Statistics 2024-06-05 Xizewen Han, Mingyuan Zhou

MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation

Recent advancements in the field of Diffusion Transformers have substantially improved the generation of high-quality 2D images, 3D videos, and 3D shapes. However, the effectiveness of the Transformer architecture in the domain of co-speech…

Computer Vision and Pattern Recognition · Computer Science 2024-08-07 Xiaofeng Mao, Zhengkai Jiang, Qilin Wang, Chencan Fu, Jiangning Zhang, Jiafu Wu, Yabiao Wang, Chengjie Wang, Wei Li, Mingmin Chi

Search-Augmented Masked Diffusion Models for Constrained Generation

Discrete diffusion models generate sequences by iteratively denoising samples corrupted by categorical noise, offering an appealing alternative to autoregressive decoding for structured and symbolic generation. However, standard training…

Machine Learning · Computer Science 2026-02-04 Huu Binh Ta, Michael Cardei, Alvaro Velasquez, Ferdinando Fioretto

Think While You Generate: Discrete Diffusion with Planned Denoising

Discrete diffusion has achieved state-of-the-art performance, outperforming or approaching autoregressive models on standard benchmarks. In this work, we introduce Discrete Diffusion with Planned Denoising (DDPD), a novel framework that…

Machine Learning · Computer Science 2025-04-11 Sulin Liu, Juno Nam, Andrew Campbell, Hannes Stärk, Yilun Xu, Tommi Jaakkola, Rafael Gómez-Bombarelli

DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery

Learning from a large corpus of data, pre-trained models have achieved impressive progress nowadays. As popular generative pre-training, diffusion models capture both low-level visual knowledge and high-level semantic relations. In this…

Computer Vision and Pattern Recognition · Computer Science 2023-03-20 Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Jinxiang Liu, Yu Wang, Ya Zhang, Yanfeng Wang

Improving BERT with Hybrid Pooling Network and Drop Mask

Transformer-based pre-trained language models, such as BERT, achieve great success in various natural language understanding tasks. Prior research found that BERT captures a rich hierarchy of linguistic information at different layers.…

Computation and Language · Computer Science 2023-07-17 Qian Chen, Wen Wang, Qinglin Zhang, Chong Deng, Ma Yukun, Siqi Zheng

Distilling Knowledge Learned in BERT for Text Generation

Large-scale pre-trained language model such as BERT has achieved great success in language understanding tasks. However, it remains an open question how to utilize BERT for language generation. In this paper, we present a novel approach,…

Computation and Language · Computer Science 2020-07-21 Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu

DiffuseDef: Improved Robustness to Adversarial Attacks via Iterative Denoising

Pretrained language models have significantly advanced performance across various natural language processing tasks. However, adversarial attacks continue to pose a critical challenge to systems built using these models, as they can be…

Computation and Language · Computer Science 2025-05-20 Zhenhao Li, Huichi Zhou, Marek Rei, Lucia Specia

Denoising Diffusion Semantic Segmentation with Mask Prior Modeling

The evolution of semantic segmentation has long been dominated by learning more discriminative image representations for classifying each pixel. Despite the prominent advancements, the priors of segmentation masks themselves, e.g.,…

Computer Vision and Pattern Recognition · Computer Science 2023-06-23 Zeqiang Lai, Yuchen Duan, Jifeng Dai, Ziheng Li, Ying Fu, Hongsheng Li, Yu Qiao, Wenhai Wang

Generative Modeling with Diffusion

We provide an overview of the diffusion model as a method to generate new samples. Generative models have been recently adopted for tasks such as art generation (Stable Diffusion, Dall-E) and text generation (ChatGPT). Diffusion models in…

Machine Learning · Statistics 2025-06-13 Justin Le