Related papers: Transfer Learning for Text Diffusion Models

200 papers

Diffusion Language Models (DLMs) have emerged as a promising new paradigm for text generative modeling, potentially addressing limitations of autoregressive (AR) models. However, current DLMs have been studied at a smaller scale compared to…

Computation and Language · Computer Science 2025-06-03 Shansan Gong, Shivam Agarwal, Yizhe Zhang, Jiacheng Ye, Lin Zheng, Mukai Li, Chenxin An, Peilin Zhao, Wei Bi, Jiawei Han, Hao Peng, Lingpeng Kong

Diffusion-based decoding has recently emerged as an appealing alternative to autoregressive (AR) generation, offering the potential to update multiple tokens in parallel and reduce latency. However, diffusion vision language models (dVLMs)…

Computer Vision and Pattern Recognition · Computer Science 2026-04-01 Lunbin Zeng, Jingfeng Yao, Bencheng Liao, Hongyuan Tao, Wenyu Liu, Xinggang Wang

Large language model (LLM)-based embedding models, benefiting from large scale pre-training and post-training, have begun to surpass BERT and T5-based models on general-purpose text embedding tasks such as document retrieval. However, a…

Computation and Language · Computer Science 2025-05-22 Siyue Zhang, Yilun Zhao, Liyuan Geng, Arman Cohan, Anh Tuan Luu, Chen Zhao

Latent diffusion models offer an attractive alternative to discrete diffusion for non-autoregressive text generation by operating on continuous text representations and denoising entire sequences in parallel. The major challenge in latent…

Computation and Language · Computer Science 2026-05-11 Viacheslav Meshchaninov, Alexander Shabalin, Egor Chimbulatov, Nikita Gushchin, Ilya Koziev, Alexander Korotin, Dmitry Vetrov

Autoregressive (AR) language models generate text one token at a time, which limits their inference speed. Diffusion-based language models offer a promising alternative, as they can decode multiple tokens in parallel. However, we identify a…

Computation and Language · Computer Science 2025-10-27 Yeongbin Seo, Dongha Lee, Jaehyung Kim, Jinyoung Yeo

Diffusion Large Language Models (dLLMs) have emerged as a promising alternative to autoregressive (AR) LLMs for text generation, with the potential to decode multiple tokens in a single iteration. However, none of the existing open-source…

Machine Learning · Computer Science 2025-08-14 Xu Wang, Chenkai Xu, Yijie Jin, Jiachun Jin, Hao Zhang, Zhijie Deng

Diffusion language models (dLMs) have emerged as a promising paradigm that enables parallel, non-autoregressive generation, but their learning efficiency lags behind that of autoregressive (AR) language models when trained from scratch. To…

We introduce TESS 2, a general instruction-following diffusion language model that outperforms contemporary instruction-tuned diffusion models, as well as matches and sometimes exceeds strong autoregressive (AR) models. We train TESS 2 by…

Computation and Language · Computer Science 2025-06-03 Jaesung Tae, Hamish Ivison, Sachin Kumar, Arman Cohan

Recent advances in large language models (LLMs) have attracted significant interest in extending their capabilities to multimodal scenarios, particularly for speech-to-speech conversational systems. However, existing multimodal models…

Computation and Language · Computer Science 2026-03-26 Tianqiao Liu, Xueyi Li, Hao Wang, Haoxuan Li, Zhichao Chen, Weiqi Luo, Zitao Liu

Recently, diffusion models have excelled in image generation tasks and have also been applied to natural language processing (NLP) for controllable text generation. However, the application of diffusion models in a cross-lingual setting is…

Computation and Language · Computer Science 2023-08-01 Linyao Chen, Aosong Feng, Boming Yang, Zihui Li

Diffusion Language Models (DLMs) are rapidly emerging as a powerful and promising alternative to the dominant autoregressive (AR) paradigm. By generating tokens in parallel through an iterative denoising process, DLMs possess inherent…

Computation and Language · Computer Science 2025-12-08 Tianyi Li, Mingda Chen, Bowei Guo, Zhiqiang Shen

Groundbreaking advancements in text-to-image generation have recently been achieved with the emergence of diffusion models. These models exhibit a remarkable ability to generate highly artistic and intricately detailed images based on…

Computer Vision and Pattern Recognition · Computer Science 2025-02-10 Ziyi Dong, Yao Xiao, Pengxu Wei, Liang Lin

The diffusion model has been proven a powerful generative model in recent years, yet remains a challenge in generating visual text. Several methods alleviated this issue by incorporating explicit text position and content as guidance on…

Computer Vision and Pattern Recognition · Computer Science 2023-11-29 Jingye Chen, Yupan Huang, Tengchao Lv, Lei Cui, Qifeng Chen, Furu Wei

Diffusion language models (DLMs) have recently demonstrated capabilities that complement standard autoregressive (AR) models, particularly in non-sequential generation and bidirectional editing. Although recent work has shown that…

Machine Learning · Computer Science 2026-05-11 Fred Zhangzhi Peng, Alexis Fox, Anru R. Zhang, Alexander Tong

Text-to-image diffusion models are a class of deep generative models that have demonstrated an impressive capacity for high-quality image generation. However, these models are susceptible to implicit biases that arise from web-scale…

Computer Vision and Pattern Recognition · Computer Science 2024-01-24 Yinan Zhang, Eric Tzeng, Yilun Du, Dmitry Kislyuk

Diffusion Large Language Models (DLLMs) have emerged as a powerful alternative to autoregressive models, enabling parallel token generation across multiple positions. However, preference alignment of DLLMs remains challenging due to high…

Computation and Language · Computer Science 2026-02-04 Liang Lin, Feng Xiong, Zengbin Wang, Kun Wang, Junhao Dong, Xuecai Hu, Yong Wang, Xiangxiang Chu

Under strictly controlled pre-training settings, we observe a Crossover: when unique data is limited, diffusion language models (DLMs) consistently surpass autoregressive (AR) models by training for more epochs. The crossover shifts later…

Machine Learning · Computer Science 2025-11-06 Jinjie Ni, Qian Liu, Longxu Dou, Chao Du, Zili Wang, Hang Yan, Tianyu Pang, Michael Qizhe Shieh

Autoregressive (AR) models have long dominated the landscape of large language models, driving progress across a wide range of tasks. Recently, diffusion-based language models have emerged as a promising alternative, though their advantages…

Machine Learning · Computer Science 2025-10-28 Mihir Prabhudesai, Mengning Wu, Amir Zadeh, Katerina Fragkiadaki, Deepak Pathak

This paper does not describe a new method; instead, it provides a thorough exploration of an important yet understudied design space related to recent advances in text-to-image synthesis -- specifically, the deep fusion of large language…

Computer Vision and Pattern Recognition · Computer Science 2025-05-16 Bingda Tang, Boyang Zheng, Xichen Pan, Sayak Paul, Saining Xie

Diffusion models have gained significant attention in the realm of image generation due to their exceptional performance. Their success has been recently expanded to text generation via generating all tokens within a sequence concurrently.…

Computation and Language · Computer Science 2023-12-14 Tong Wu, Zhihao Fan, Xiao Liu, Yeyun Gong, Yelong Shen, Jian Jiao, Hai-Tao Zheng, Juntao Li, Zhongyu Wei, Jian Guo, Nan Duan, Weizhu Chen