Related papers: DiffuSIA: A Spiral Interaction Architecture for En…

Empowering Diffusion Models on the Embedding Space for Text Generation

Diffusion models have achieved state-of-the-art synthesis quality on both visual and audio tasks, and recent works further adapt them to textual data by diffusing on the embedding space. In this paper, we conduct systematic studies of the…

Computation and Language · Computer Science 2024-04-23 Zhujin Gao, Junliang Guo, Xu Tan, Yongxin Zhu, Fang Zhang, Jiang Bian, Linli Xu

SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers

Diffusion model, a new generative modelling paradigm, has achieved great success in image, audio, and video generation. However, considering the discrete categorical nature of text, it is not trivial to extend continuous diffusion models to…

Computation and Language · Computer Science 2023-05-23 Hongyi Yuan, Zheng Yuan, Chuanqi Tan, Fei Huang, Songfang Huang

DiffusER: Discrete Diffusion via Edit-based Reconstruction

In text generation, models that generate text from scratch one token at a time are currently the dominant paradigm. Despite being performant, these models lack the ability to revise existing text, which limits their usability in many…

Computation and Language · Computer Science 2022-11-01 Machel Reid, Vincent J. Hellendoorn, Graham Neubig

TransFusion: Transcribing Speech with Multinomial Diffusion

Diffusion models have shown exceptional scaling properties in the image synthesis domain, and initial attempts have shown similar benefits for applying diffusion to unconditional text synthesis. Denoising diffusion models attempt to…

Audio and Speech Processing · Electrical Eng. & Systems 2022-10-17 Matthew Baas, Kevin Eloff, Herman Kamper

The Hidden Language of Diffusion Models

Text-to-image diffusion models have demonstrated an unparalleled ability to generate high-quality, diverse images from a textual prompt. However, the internal representations learned by these models remain an enigma. In this work, we…

Computer Vision and Pattern Recognition · Computer Science 2023-10-06 Hila Chefer, Oran Lang, Mor Geva, Volodymyr Polosukhin, Assaf Shocher, Michal Irani, Inbar Mosseri, Lior Wolf

DiffusionDialog: A Diffusion Model for Diverse Dialog Generation with Latent Space

In real-life conversations, the content is diverse, and there exists the one-to-many problem that requires diverse generation. Previous studies attempted to introduce discrete or Gaussian-based continuous latent variables to address the…

Computation and Language · Computer Science 2024-04-11 Jianxiang Xiang, Zhenhua Liu, Haodong Liu, Yin Bai, Jia Cheng, Wenliang Chen

DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Mode

Text-driven image generation using diffusion models has recently gained significant attention. To enable more flexible image manipulation and editing, recent research has expanded from single image generation to transparent layer generation…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Junjia Huang, Pengxiang Yan, Jinhang Cai, Jiyang Liu, Zhao Wang, Yitong Wang, Xinglong Wu, Guanbin Li

TransDiffuser: Diverse Trajectory Generation with Decorrelated Multi-modal Representation for End-to-end Autonomous Driving

In recent years, diffusion models have demonstrated remarkable potential across diverse domains, from vision generation to language modeling. Transferring its generative capabilities to modern end-to-end autonomous driving systems has also…

Robotics · Computer Science 2025-09-17 Xuefeng Jiang, Yuan Ma, Pengxiang Li, Leimeng Xu, Xin Wen, Kun Zhan, Zhongpu Xia, Peng Jia, Xianpeng Lang, Sheng Sun

Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts

Diffusion models have achieved remarkable success across a range of generative tasks. Recent efforts to enhance diffusion model architectures have reimagined them as a form of multi-task learning, where each task corresponds to a denoising…

Computer Vision and Pattern Recognition · Computer Science 2024-07-11 Byeongjun Park, Hyojun Go, Jin-Young Kim, Sangmin Woo, Seokil Ham, Changick Kim

TextDiffuser: Diffusion Models as Text Painters

Diffusion models have gained increasing attention for their impressive generation abilities but currently struggle with rendering accurate and coherent text. To address this issue, we introduce TextDiffuser, focusing on generating images…

Computer Vision and Pattern Recognition · Computer Science 2023-10-31 Jingye Chen, Yupan Huang, Tengchao Lv, Lei Cui, Qifeng Chen, Furu Wei

Diffusion Models for Tabular Data Imputation and Synthetic Data Generation

Data imputation and data generation have important applications for many domains, like healthcare and finance, where incomplete or missing data can hinder accurate analysis and decision-making. Diffusion models have emerged as powerful…

Machine Learning · Computer Science 2025-06-10 Mario Villaizán-Vallelado, Matteo Salvatori, Carlos Segura, Ioannis Arapakis

One Diffusion to Generate Them All

We introduce OneDiffusion, a versatile, large-scale diffusion model that seamlessly supports bidirectional image synthesis and understanding across diverse tasks. It enables conditional generation from inputs such as text, depth, pose,…

Computer Vision and Pattern Recognition · Computer Science 2025-06-16 Duong H. Le, Tuan Pham, Sangho Lee, Christopher Clark, Aniruddha Kembhavi, Stephan Mandt, Ranjay Krishna, Jiasen Lu

GlyphDiffusion: Text Generation as Image Generation

Diffusion models have become a new generative paradigm for text generation. Considering the discrete categorical nature of text, in this paper, we propose GlyphDiffusion, a novel diffusion approach for text generation via text-guided image…

Computation and Language · Computer Science 2023-05-09 Junyi Li, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises

While diffusion models have achieved great success in generating continuous signals such as images and audio, it remains elusive for diffusion models in learning discrete sequence data like natural languages. Although recent advances…

Computation and Language · Computer Science 2024-05-02 Jiasheng Ye, Zaixiang Zheng, Yu Bao, Lihua Qian, Mingxuan Wang

Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model

Existing multi-modal image fusion methods fail to address the compound degradations presented in source images, resulting in fusion images plagued by noise, color bias, improper exposure, etc. Additionally, these methods often…

Computer Vision and Pattern Recognition · Computer Science 2024-11-01 Hao Zhang, Lei Cao, Jiayi Ma

DiffSG: A Generative Solver for Network Optimization with Diffusion Model

Generative diffusion models, famous for their performance in image generation, are popular in various cross-domain applications. However, their use in the communication community has been mostly limited to auxiliary tasks like data modeling…

Networking and Internet Architecture · Computer Science 2025-03-11 Ruihuai Liang, Bo Yang, Zhiwen Yu, Bin Guo, Xuelin Cao, Mérouane Debbah, H. Vincent Poor, Chau Yuen

Degradation-Robust Fusion: An Efficient Degradation-Aware Diffusion Framework for Multimodal Image Fusion in Arbitrary Degradation Scenarios

Complex degradations like noise, blur, and low resolution are typical challenges in real world image fusion tasks, limiting the performance and practicality of existing methods. End to end neural network based approaches are generally…

Computer Vision and Pattern Recognition · Computer Science 2026-04-13 Yu Shi, Yu Liu, Zhong-Cheng Wu, Juan Cheng, Huafeng Li, Xun Chen

eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers

Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned high-resolution image synthesis. Starting from random noise, such text-to-image diffusion models gradually synthesize images in an iterative fashion…

Computer Vision and Pattern Recognition · Computer Science 2023-03-15 Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Qinsheng Zhang, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, Ming-Yu Liu

Unifying Continuous and Discrete Text Diffusion with Non-simultaneous Diffusion Processes

Diffusion models have emerged as a promising approach for text generation, with recent works falling into two main categories: discrete and continuous diffusion models. Discrete diffusion models apply token corruption independently using…

Computation and Language · Computer Science 2025-05-29 Bocheng Li, Zhujin Gao, Linli Xu

Diffusion models for audio semantic communication

Directly sending audio signals from a transmitter to a receiver across a noisy channel may absorb consistent bandwidth and be prone to errors when trying to recover the transmitted bits. On the contrary, the recent semantic communication…

Sound · Computer Science 2023-09-15 Eleonora Grassucci, Christian Marinoni, Andrea Rodriguez, Danilo Comminiello