Related papers: SeqDiffuSeq: Text Diffusion with Encoder-Decoder T…

DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models

Recently, diffusion models have emerged as a new paradigm for generative models. Despite the success in domains using continuous signals such as vision and audio, adapting diffusion models to natural language is under-explored due to the…

Computation and Language · Computer Science · 2023-02-15 · Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong

Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration

The diffusion model, a new generative modeling paradigm, has achieved significant success in generating images, audio, video, and text. It has been adapted for sequence-to-sequence text generation (Seq2Seq) through DiffuSeq, termed S2S…

Computation and Language · Computer Science · 2024-10-18 · Yun-Yen Chuang, Hung-Min Hsu, Kevin Lin, Chen-Sheng Gu, Ling Zhen Li, Ray-I Chang, Hung-yi Lee

Self-conditioned Embedding Diffusion for Text Generation

Can continuous diffusion models bring the same performance breakthrough on natural language they did for image generation? To circumvent the discrete nature of text data, we can simply project tokens in a continuous space of embeddings, as…

Computation and Language · Computer Science · 2022-11-09 · Robin Strudel, Corentin Tallec, Florent Altché, Yilun Du, Yaroslav Ganin, Arthur Mensch, Will Grathwohl, Nikolay Savinov, Sander Dieleman, Laurent Sifre, Rémi Leblond

Empowering Diffusion Models on the Embedding Space for Text Generation

Diffusion models have achieved state-of-the-art synthesis quality on both visual and audio tasks, and recent works further adapt them to textual data by diffusing on the embedding space. In this paper, we conduct systematic studies of the…

Computation and Language · Computer Science · 2024-04-23 · Zhujin Gao, Junliang Guo, Xu Tan, Yongxin Zhu, Fang Zhang, Jiang Bian, Linli Xu

Unifying Continuous and Discrete Text Diffusion with Non-simultaneous Diffusion Processes

Diffusion models have emerged as a promising approach for text generation, with recent works falling into two main categories: discrete and continuous diffusion models. Discrete diffusion models apply token corruption independently using…

Computation and Language · Computer Science · 2025-05-29 · Bocheng Li, Zhujin Gao, Linli Xu

Latent Diffusion for Language Generation

Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have…

Computation and Language · Computer Science · 2023-11-08 · Justin Lovelace, Varsha Kishore, Chao Wan, Eliot Shekhtman, Kilian Q. Weinberger

TESS: Text-to-Text Self-Conditioned Simplex Diffusion

Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various continuous domains. However, applying continuous diffusion models to natural language remains challenging due to its discrete…

Computation and Language · Computer Science · 2024-02-22 · Rabeeh Karimi Mahabadi, Hamish Ivison, Jaesung Tae, James Henderson, Iz Beltagy, Matthew E. Peters, Arman Cohan

DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises

While diffusion models have achieved great success in generating continuous signals such as images and audio, it remains elusive for diffusion models in learning discrete sequence data like natural languages. Although recent advances…

Computation and Language · Computer Science · 2024-05-02 · Jiasheng Ye, Zaixiang Zheng, Yu Bao, Lihua Qian, Mingxuan Wang

TEncDM: Understanding the Properties of the Diffusion Model in the Space of Language Model Encodings

This paper presents the Text Encoding Diffusion Model (TEncDM), a novel approach to diffusion modeling that operates in the space of pre-trained language model encodings. In contrast to traditionally used embeddings, encodings integrate…

Computation and Language · Computer Science · 2025-02-25 · Alexander Shabalin, Viacheslav Meshchaninov, Egor Chimbulatov, Vladislav Lapikov, Roman Kim, Grigory Bartosh, Dmitry Molchanov, Sergey Markov, Dmitry Vetrov

DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation

While Diffusion Generative Models have achieved great success on image generation tasks, how to efficiently and effectively incorporate them into speech generation especially translation tasks remains a non-trivial problem. Specifically,…

Computation and Language · Computer Science · 2023-10-27 · Yongxin Zhu, Zhujin Gao, Xinyuan Zhou, Zhongyi Ye, Linli Xu

Diffsound: Discrete Diffusion Model for Text-to-sound Generation

Generating sound effects that humans want is an important topic. However, there are few studies in this area for sound generation. In this study, we investigate generating sound conditioned on a text prompt and propose a novel text-to-sound…

Sound · Computer Science · 2023-05-01 · Dongchao Yang, Jianwei Yu, Helin Wang, Wen Wang, Chao Weng, Yuexian Zou, Dong Yu

SA-DiffuSeq: Addressing Computational and Scalability Challenges in Long-Document Generation with Sparse Attention

Diffusion based approaches to long form text generation suffer from prohibitive computational cost and memory overhead as sequence length increases. We introduce SA-DiffuSeq, a diffusion framework that integrates sparse attention to…

Computation and Language · Computer Science · 2025-12-25 · Alexandros Christoforos, Chadbourne Davis

Discrete Diffusion for Generative Modeling of Text-Aligned Speech Tokens

This paper introduces a discrete diffusion model (DDM) framework for text-aligned speech tokenization and reconstruction. By replacing the auto-regressive speech decoder with a discrete diffusion counterpart, our model achieves…

Audio and Speech Processing · Electrical Eng. & Systems · 2025-09-25 · Pin-Jui Ku, He Huang, Jean-Marie Lemercier, Subham Sekhar Sahoo, Zhehuai Chen, Ante Jukić

Segment-Level Diffusion: A Framework for Controllable Long-Form Generation with Diffusion Language Models

Diffusion models have shown promise in text generation, but often struggle with generating long, coherent, and contextually accurate text. Token-level diffusion doesn't model word-order dependencies explicitly and operates on short, fixed…

Computation and Language · Computer Science · 2025-05-27 · Xiaochen Zhu, Georgi Karadzhov, Chenxi Whitehouse, Andreas Vlachos

TransFusion: Transcribing Speech with Multinomial Diffusion

Diffusion models have shown exceptional scaling properties in the image synthesis domain, and initial attempts have shown similar benefits for applying diffusion to unconditional text synthesis. Denoising diffusion models attempt to…

Audio and Speech Processing · Electrical Eng. & Systems · 2022-10-17 · Matthew Baas, Kevin Eloff, Herman Kamper

eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers

Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned high-resolution image synthesis. Starting from random noise, such text-to-image diffusion models gradually synthesize images in an iterative fashion…

Computer Vision and Pattern Recognition · Computer Science · 2023-03-15 · Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Qinsheng Zhang, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, Ming-Yu Liu

Unifying Autoregressive and Diffusion-Based Sequence Generation

We present significant extensions to diffusion-based sequence generation models, blurring the line with autoregressive language models. We introduce hyperschedules, which assign distinct noise schedules to individual token positions,…

Machine Learning · Computer Science · 2025-10-08 · Nima Fathi, Torsten Scholak, Pierre-André Noël

DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models

In this paper, we present DesignDiffusion, a simple yet effective framework for the novel task of synthesizing design images from textual descriptions. A primary challenge lies in generating accurate and style-consistent textual and visual…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-04 · Zhendong Wang, Jianmin Bao, Shuyang Gu, Dong Chen, Wengang Zhou, Houqiang Li

DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models

We present DiffusionBERT, a new generative masked language model based on discrete diffusion models. Diffusion models and many pre-trained language models have a shared training objective, i.e., denoising, making it possible to combine the…

Computation and Language · Computer Science · 2022-12-02 · Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu

EDSep: An Effective Diffusion-Based Method for Speech Source Separation

Generative models have attracted considerable attention for speech separation tasks, and among these, diffusion-based methods are being explored. Despite the notable success of diffusion techniques in generation tasks, their adaptation to…

Audio and Speech Processing · Electrical Eng. & Systems · 2025-01-28 · Jinwei Dong, Xinsheng Wang, Qirong Mao