Related papers: Dream-Coder 7B: An Open Diffusion Language Model f…

Dream 7B: Diffusion Large Language Models

We introduce Dream 7B, the most powerful open diffusion large language model to date. Unlike autoregressive (AR) models that generate tokens sequentially, Dream 7B employs discrete diffusion modeling to refine sequences in parallel through…

Computation and Language · Computer Science 2025-08-22 Jiacheng Ye , Zhihui Xie , Lin Zheng , Jiahui Gao , Zirui Wu , Xin Jiang , Zhenguo Li , Lingpeng Kong

DreamOn: Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas

Diffusion Language Models (DLMs) present a compelling alternative to autoregressive models, offering flexible, any-order infilling without specialized prompting design. However, their practical utility is blocked by a critical limitation:…

Computation and Language · Computer Science 2026-02-03 Zirui Wu , Lin Zheng , Zhihui Xie , Jiacheng Ye , Jiahui Gao , Shansan Gong , Yansong Feng , Zhenguo Li , Wei Bi , Guorui Zhou , Lingpeng Kong

FlashDLM: Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion

Diffusion language models offer parallel token generation and inherent bidirectionality, promising more efficient and powerful sequence modeling compared to autoregressive approaches. However, state-of-the-art diffusion models (e.g., Dream…

Computation and Language · Computer Science 2025-10-10 Zhanqiu Hu , Jian Meng , Yash Akhauri , Mohamed S. Abdelfattah , Jae-sun Seo , Zhiru Zhang , Udit Gupta

DPRM: A Plug-in Doob h transform-induced Token-Ordering Module for Diffusion Language Models

Diffusion language models generate without a fixed left-to-right order, making token ordering a central algorithmic choice: which tokens should be revealed, retained, revised or verified at each step? Existing systems mainly use random…

Machine Learning · Computer Science 2026-04-28 Dake Bu , Wei Huang , Andi Han , Hau-San Wong , Qingfu Zhang , Taiji Suzuki , Atsushi Nitanda

CoDA: Coding LM via Diffusion Adaptation

Diffusion language models promise bidirectional context and infilling capabilities that autoregressive coders lack, yet practical systems remain heavyweight. We introduce CoDA, a 1.7B-parameter diffusion coder trained on TPU with a fully…

Machine Learning · Computer Science 2025-10-07 Haolin Chen , Shiyu Wang , Can Qin , Bo Pang , Zuxin Liu , Jielin Qiu , Jianguo Zhang , Yingbo Zhou , Zeyuan Chen , Ran Xu , Shelby Heinecke , Silvio Savarese , Caiming Xiong , Huan Wang , Weiran Yao

AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation

Diffusion models have gained significant attention in the realm of image generation due to their exceptional performance. Their success has been recently expanded to text generation via generating all tokens within a sequence concurrently.…

Computation and Language · Computer Science 2023-12-14 Tong Wu , Zhihao Fan , Xiao Liu , Yeyun Gong , Yelong Shen , Jian Jiao , Hai-Tao Zheng , Juntao Li , Zhongyu Wei , Jian Guo , Nan Duan , Weizhu Chen

ReCode: Reinforcing Code Generation with Reasoning-Process Rewards

In practice, rigorous reasoning is often a key driver of correct code, while Reinforcement Learning (RL) for code generation often neglects optimizing reasoning quality. Bringing process-level supervision into RL is appealing, but it faces…

Software Engineering · Computer Science 2026-05-06 Lishui Fan , Yu Zhang , Mouxiang Chen , Zhongxin Liu

DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation

Diffusion large language models (dLLMs) are compelling alternatives to autoregressive (AR) models because their denoising models operate over the entire sequence. The global planning and iterative refinement features of dLLMs are…

Computation and Language · Computer Science 2025-06-27 Shansan Gong , Ruixiang Zhang , Huangjie Zheng , Jiatao Gu , Navdeep Jaitly , Lingpeng Kong , Yizhe Zhang

Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model

Diffusion-based language models (DLLMs) offer non-sequential, block-wise generation and richer data reuse compared to autoregressive (AR) models, but existing code DLLMs still lag behind strong AR baselines under comparable budgets. We…

Computation and Language · Computer Science 2026-01-26 Chenghao Fan , Wen Heng , Bo Li , Sichen Liu , Yuxuan Song , Jing Su , Xiaoye Qu , Kai Shen , Wei Wei

Scaling Diffusion Language Models via Adaptation from Autoregressive Models

Diffusion Language Models (DLMs) have emerged as a promising new paradigm for text generative modeling, potentially addressing limitations of autoregressive (AR) models. However, current DLMs have been studied at a smaller scale compared to…

Computation and Language · Computer Science 2025-06-03 Shansan Gong , Shivam Agarwal , Yizhe Zhang , Jiacheng Ye , Lin Zheng , Mukai Li , Chenxin An , Peilin Zhao , Wei Bi , Jiawei Han , Hao Peng , Lingpeng Kong

Soft-Masked Diffusion Language Models

Diffusion models have demonstrated strong potential in language modeling, offering various advantages over traditional autoregressive approaches. Their ability to generate and revise entire responses in parallel enables faster generation…

Machine Learning · Computer Science 2026-03-03 Michael Hersche , Samuel Moor-Smith , Thomas Hofmann , Abbas Rahimi

Beyond Autoregression: An Empirical Study of Diffusion Large Language Models for Code Generation

LLMs have become the mainstream approaches to code generation. Existing LLMs mainly employ autoregressive generation, i.e. generating code token-by-token from left to right. However, the underlying autoregressive generation has two…

Software Engineering · Computer Science 2025-11-04 Chengze Li , Yitong Zhang , Jia Li , Liyi Cai , Ge Li

AnCoder: Anchored Code Generation via Discrete Diffusion Models

Diffusion language models offer a compelling alternative to autoregressive code generation, enabling global planning and iterative refinement of complex program logic. However, existing approaches fail to respect the rigid structure of…

Machine Learning · Computer Science 2026-02-23 Anton Xue , Litu Rout , Constantine Caramanis , Sanjay Shakkottai

Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models

We propose TraceRL, a trajectory-aware reinforcement learning framework for diffusion language models (DLMs) that incorporates preferred inference trajectory into post-training, and is applicable across different architectures. Equipped…

Computation and Language · Computer Science 2025-09-09 Yinjie Wang , Ling Yang , Bowen Li , Ye Tian , Ke Shen , Mengdi Wang

ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning

While Large Language Models (LLMs) have revolutionized code generation, standard ``System 1'' approaches that generate solutions in a single forward pass often hit a performance ceiling on complex algorithmic tasks. Existing iterative…

Computation and Language · Computer Science 2026-04-21 Juyong Jiang , Jiasi Shen , Sunghun Kim , Kang Min Yoo , Jeonghoon Kim , Sungju Kim

Autoregressive Models Rival Diffusion Models at ANY-ORDER Generation

Diffusion language models enable any-order generation and bidirectional conditioning, offering appealing flexibility for tasks such as infilling, rewriting, and self-correction. However, their formulation-predicting one part of a sequence…

Computation and Language · Computer Science 2026-01-21 Tianqi Du , Lizhe Fang , Weijie Yang , Chenheng Zhang , Zeming Wei , Yifei Wang , Yisen Wang

Discrete Diffusion Models for Language Generation

Diffusion models have emerged as a powerful class of generative models, achieving state-of-the-art results in continuous data domains such as image and video generation. Their core mechanism involves a forward diffusion process that…

Computation and Language · Computer Science 2025-07-10 Ashen Weligalle

Towards Better Correctness and Efficiency in Code Generation

While code large language models have demonstrated remarkable progress in code generation, the generated code often exhibits poor runtime efficiency, limiting its practical application in performance-sensitive scenarios. To address this…

Software Engineering · Computer Science 2025-08-29 Yunlong Feng , Yang Xu , Xiao Xu , Binyuan Hui , Junyang Lin

Constrained Code Generation with Discrete Diffusion

Discrete diffusion models are a powerful, emerging paradigm for code generation. They construct programs through iterative refinement of partially corrupted token sequences and enable parallel token refinement. Importantly, this paradigm…

Computation and Language · Computer Science 2026-05-19 Lize Shao , Michael Cardei , Zichen Xie , Ferdinando Fioretto , Wenxi Wang

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

We introduce Transfusion, a recipe for training a multi-modal model over discrete and continuous data. Transfusion combines the language modeling loss function (next token prediction) with diffusion to train a single transformer over…

Artificial Intelligence · Computer Science 2024-08-21 Chunting Zhou , Lili Yu , Arun Babu , Kushal Tirumala , Michihiro Yasunaga , Leonid Shamis , Jacob Kahn , Xuezhe Ma , Luke Zettlemoyer , Omer Levy