Related papers: Dystruct: Dynamically Structured Diffusion Languag…

Diffusion LLM with Native Variable Generation Lengths: Let [EOS] Lead the Way

Diffusion-based large language models (dLLMs) have exhibited substantial potential for parallel text generation, which may enable more efficient generation compared to autoregressive models. However, current dLLMs suffer from fixed…

Computation and Language · Computer Science 2025-10-29 Yicun Yang, Cong Wang, Shaobo Wang, Zichen Wen, Biqing Qi, Hanlin Xu, Linfeng Zhang

Beyond Fixed: Training-Free Variable-Length Denoising for Diffusion Large Language Models

Diffusion Large Language Models (DLLMs) are emerging as a powerful alternative to the dominant Autoregressive Large Language Models, offering efficient parallel generation and capable global context modeling. However, the practical…

Computation and Language · Computer Science 2025-08-19 Jinsong Li, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Jiaqi Wang, Dahua Lin

DSB: Dynamic Sliding Block Scheduling for Diffusion LLMs

Diffusion large language models (dLLMs) have emerged as a promising alternative for text generation, distinguished by their native support for parallel decoding. In practice, block inference is crucial for avoiding order misalignment in…

Computation and Language · Computer Science 2026-03-17 Lizhuo Luo, Shenggui Li, Yonggang Wen, Tianwei Zhang

DreamOn: Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas

Diffusion Language Models (DLMs) present a compelling alternative to autoregressive models, offering flexible, any-order infilling without specialized prompting design. However, their practical utility is blocked by a critical limitation:…

Computation and Language · Computer Science 2026-02-03 Zirui Wu, Lin Zheng, Zhihui Xie, Jiacheng Ye, Jiahui Gao, Shansan Gong, Yansong Feng, Zhenguo Li, Wei Bi, Guorui Zhou, Lingpeng Kong

How to Train Your Latent Diffusion Language Model Jointly With the Latent Space

Latent diffusion models offer an attractive alternative to discrete diffusion for non-autoregressive text generation by operating on continuous text representations and denoising entire sequences in parallel. The major challenge in latent…

Computation and Language · Computer Science 2026-05-11 Viacheslav Meshchaninov, Alexander Shabalin, Egor Chimbulatov, Nikita Gushchin, Ilya Koziev, Alexander Korotin, Dmitry Vetrov

Deferred Commitment Decoding for Diffusion Language Models

Diffusion language models (DLMs) have recently emerged as a strong alternative to autoregressive models by enabling parallel text generation. To improve inference efficiency and KV-cache compatibility, prior work commonly adopts block-based…

Computation and Language · Computer Science 2026-01-21 Yingte Shu, Yuchuan Tian, Chao Xu, Yunhe Wang, Hanting Chen

DepCap: Adaptive Block-Wise Parallel Decoding for Efficient Diffusion LM Inference

Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive language generation due to their potential for parallel decoding and global refinement of the entire sequence. To unlock this potential, DLM…

Machine Learning · Computer Science 2026-04-20 Xiang Xia, Wuyang Zhang, Jiazheng Liu, Cheng Yan, Yanyong Zhang

Improving Variable-Length Generation in Diffusion Language Models via Length Regularization

Diffusion Large Language Models (DLLMs) are inherently ill-suited for variable-length generation, as their inference is defined on a fixed-length canvas and implicitly assumes a known target length. When the length is unknown, as in…

Computation and Language · Computer Science 2026-02-10 Zicong Cheng, Ruixuan Jia, Jia Li, Guo-Wei Yang, Meng-Hao Guo, Shi-Min Hu

Diffusion Language Models Are Natively Length-Aware

Unlike autoregressive language models, which terminate variable-length generation upon predicting an End-of-Sequence (EoS) token, Diffusion Language Models (DLMs) operate over a fixed maximum-length context window for a predetermined number…

Computation and Language · Computer Science 2026-03-09 Vittorio Rossi, Giacomo Cirò, Davide Beltrame, Luca Gandolfi, Paul Röttger, Dirk Hovy

A Survey on Diffusion Language Models

Diffusion Language Models (DLMs) are rapidly emerging as a powerful and promising alternative to the dominant autoregressive (AR) paradigm. By generating tokens in parallel through an iterative denoising process, DLMs possess inherent…

Computation and Language · Computer Science 2025-12-08 Tianyi Li, Mingda Chen, Bowei Guo, Zhiqiang Shen

Decoding Large Language Diffusion Models with Foreseeing Movement

Large Language Diffusion Models (LLDMs) benefit from a flexible decoding mechanism that enables parallelized inference and controllable generations over autoregressive models. Yet such flexibility introduces a critical challenge: inference…

Machine Learning · Computer Science 2025-12-05 Yichuan Mo, Quan Chen, Mingjie Li, Zeming Wei, Yisen Wang

DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models

Recent advancements in large language models (LLMs) have significantly enhanced their knowledge and generative capabilities, leading to a surge of interest in leveraging LLMs for high-quality data synthesis. However, synthetic data…

Machine Learning · Computer Science 2025-06-11 Ying Zhou, Xinyao Wang, Yulei Niu, Yaojie Shen, Lexin Tang, Fan Chen, Ben He, Le Sun, Longyin Wen

WavefrontDiffusion: Dynamic Decoding Schedule for Improved Reasoning

Diffusion Language Models (DLMs) have shown strong potential for text generation and are becoming a competitive alternative to autoregressive models. The denoising strategy plays an important role in determining the quality of their…

Machine Learning · Computer Science 2026-03-03 Haojin Yang, Rui Hu, Zequn Sun, Rui Zhou, Yujun Cai, Yiwei Wang

Sequential Diffusion Language Models

Diffusion language models (DLMs) have strong theoretical efficiency but are limited by fixed-length decoding and incompatibility with key-value (KV) caches. Block diffusion mitigates these issues, yet still enforces a fixed block size and…

Computation and Language · Computer Science 2025-09-30 Yangzhou Liu, Yue Cao, Hao Li, Gen Luo, Zhe Chen, Weiyun Wang, Xiaobo Liang, Biqing Qi, Lijun Wu, Changyao Tian, Yanting Zhang, Yuqiang Li, Tong Lu, Yu Qiao, Jifeng Dai, Wenhai Wang

Steering Without Breaking: Mechanistically Informed Interventions for Discrete Diffusion Language Models

Discrete diffusion language models (DLMs) generate text by iteratively denoising all positions in parallel, offering an alternative to autoregressive models. Controlled generation methods for DLMs, imported from autoregressive models, apply…

Machine Learning · Computer Science 2026-05-13 Hanhan Zhou, Shamik Roy, Rashmi Gangadharaiah

Fast Byte Latent Transformer

Recent byte-level language models (LMs) match the performance of token-level models without relying on subword vocabularies, yet their utility is limited by slow, byte-by-byte autoregressive generation. We address this bottleneck in the…

Computation and Language · Computer Science 2026-05-11 Julie Kallini, Artidoro Pagnoni, Tomasz Limisiewicz, Gargi Ghosh, Luke Zettlemoyer, Christopher Potts, Xiaochuang Han, Srinivasan Iyer

Efficient Self-Evaluation for Diffusion Language Models via Sequence Regeneration

Diffusion large language models (dLLMs) have recently attracted significant attention for their ability to enhance diversity, controllability, and parallelism. However, their non-sequential, bidirectionally masked generation makes quality…

Computation and Language · Computer Science 2026-03-04 Linhao Zhong, Linyu Wu, Wen Wang, Yuling Xi, Chenchen Jing, Jiaheng Zhang, Hao Chen, Chunhua Shen

Early Decisions Matter: Proximity Bias and Initial Trajectory Shaping in Non-Autoregressive Diffusion Language Models

Diffusion-based language models (dLLMs) have emerged as a promising alternative to autoregressive language models, offering the potential for parallel token generation and bidirectional context modeling. However, harnessing this flexibility…

Computation and Language · Computer Science 2026-05-28 Jiyeon Kim, Sungik Choi, Yongrae Jo, Moontae Lee, Minjoon Seo

DiffuRank: Effective Document Reranking with Diffusion Language Models

Recent advances in large language models (LLMs) have inspired new paradigms for document reranking. While this paradigm better exploits the reasoning and contextual understanding capabilities of LLMs, most existing LLM-based rerankers rely…

Information Retrieval · Computer Science 2026-02-16 Qi Liu, Kun Ai, Jiaxin Mao, Yanzhao Zhang, Mingxin Li, Dingkun Long, Pengjun Xie, Fengbin Zhu, Ji-Rong Wen

DFlash: Block Diffusion for Flash Speculative Decoding

Autoregressive large language models (LLMs) deliver strong performance but require inherently sequential decoding, leading to high inference latency and poor GPU utilization. Speculative decoding mitigates this bottleneck by using a fast…

Computation and Language · Computer Science 2026-05-29 Jian Chen, Yesheng Liang, Zhijian Liu