Related papers: Improving Variable-Length Generation in Diffusion …

DreamOn: Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas

Diffusion Language Models (DLMs) present a compelling alternative to autoregressive models, offering flexible, any-order infilling without specialized prompting design. However, their practical utility is blocked by a critical limitation:…

Computation and Language · Computer Science 2026-02-03 Zirui Wu, Lin Zheng, Zhihui Xie, Jiacheng Ye, Jiahui Gao, Shansan Gong, Yansong Feng, Zhenguo Li, Wei Bi, Guorui Zhou, Lingpeng Kong

Diffusion LLM with Native Variable Generation Lengths: Let [EOS] Lead the Way

Diffusion-based large language models (dLLMs) have exhibited substantial potential for parallel text generation, which may enable more efficient generation compared to autoregressive models. However, current dLLMs suffer from fixed…

Computation and Language · Computer Science 2025-10-29 Yicun Yang, Cong Wang, Shaobo Wang, Zichen Wen, Biqing Qi, Hanlin Xu, Linfeng Zhang

Diffusion LMs Can Approximate Optimal Infilling Lengths Implicitly

Diffusion language models (DLMs) provide a bidirectional generation framework naturally suited for infilling, yet their performance is constrained by the pre-specified infilling length. In this paper, we reveal that DLMs possess an inherent…

Machine Learning · Computer Science 2026-02-03 Hengchang Liu, Zhao Yang, Bing Su

Beyond Fixed: Training-Free Variable-Length Denoising for Diffusion Large Language Models

Diffusion Large Language Models (DLLMs) are emerging as a powerful alternative to the dominant Autoregressive Large Language Models, offering efficient parallel generation and capable global context modeling. However, the practical…

Computation and Language · Computer Science 2025-08-19 Jinsong Li, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Jiaqi Wang, Dahua Lin

Corrective Diffusion Language Models

While Diffusion Language Models (DLMs) are theoretically well-suited for iterative refinement due to their non-causal structure, they often fail to reliably revise incorrect tokens in practice. The key challenge lies in the model's…

Machine Learning · Computer Science 2026-01-30 Shuibai Zhang, Fred Zhangzhi Peng, Yiheng Zhang, Jin Pan, Grigorios G. Chrysos

Dystruct: Dynamically Structured Diffusion Language Model Decoding via Bayesian Inference

Diffusion language models (DLMs) have recently emerged as a promising alternative to autoregressive models, primarily due to their ability to enable parallel decoding. Despite this advantage, most existing DLMs rely on a fixed generation…

Machine Learning · Computer Science 2026-05-12 Bian Sun, Kevin Zhai, Mubarak Shah, Zhenyi Wang

Set-Valued Prediction for Large Language Models with Feasibility-Aware Coverage Guarantees

Large language models (LLMs) inherently operate over a large generation space, yet conventional usage typically reports the most likely generation (MLG) as a point prediction, which underestimates the model's capability: although the…

Computation and Language · Computer Science 2026-03-25 Ye Li, Anqi Hu, Yuanchang Ye, Shiyan Tong, Zhiyuan Wang, Bo Fu

Efficient Self-Evaluation for Diffusion Language Models via Sequence Regeneration

Diffusion large language models (dLLMs) have recently attracted significant attention for their ability to enhance diversity, controllability, and parallelism. However, their non-sequential, bidirectionally masked generation makes quality…

Computation and Language · Computer Science 2026-03-04 Linhao Zhong, Linyu Wu, Wen Wang, Yuling Xi, Chenchen Jing, Jiaheng Zhang, Hao Chen, Chunhua Shen

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

While autoregressive Large Vision-Language Models (VLMs) have achieved remarkable success, their sequential generation often limits their efficacy in complex visual planning and dynamic robotic control. In this work, we investigate the…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Jiacheng Ye, Shansan Gong, Jiahui Gao, Junming Fan, Shuang Wu, Wei Bi, Haoli Bai, Lifeng Shang, Lingpeng Kong

Lost in Diffusion: Uncovering Hallucination Patterns and Failure Modes in Diffusion Large Language Models

While Diffusion Large Language Models (dLLMs) have emerged as a promising non-autoregressive paradigm comparable to autoregressive (AR) models, their faithfulness, specifically regarding hallucination, remains largely underexplored. To…

Computation and Language · Computer Science 2026-04-14 Zhengnan Guo, Fei Tan

Diffusion Large Language Models for Visual Speech Recognition

Existing Visual Speech Recognition (VSR) systems commonly rely on left-to-right autoregressive decoding, which can force premature decisions on visually ambiguous tokens before sufficient context is available. We propose DLLM-VSR, to the…

Artificial Intelligence · Computer Science 2026-05-28 Jeong Hun Yeo, Chae Won Kim, Hyeongseop Rha, Yong Man Ro

REvolution: An Evolutionary Framework for RTL Generation driven by Large Language Models

Large Language Models (LLMs) are used for Register-Transfer Level (RTL) code generation, but they face two main challenges: functional correctness and Power, Performance, and Area (PPA) optimization. Iterative, feedback-based methods…

Neural and Evolutionary Computing · Computer Science 2025-10-27 Kyungjun Min, Kyumin Cho, Junhwan Jang, Seokhyeong Kang

Diffusion Language Models Are Natively Length-Aware

Unlike autoregressive language models, which terminate variable-length generation upon predicting an End-of-Sequence (EoS) token, Diffusion Language Models (DLMs) operate over a fixed maximum-length context window for a predetermined number…

Computation and Language · Computer Science 2026-03-09 Vittorio Rossi, Giacomo Cirò, Davide Beltrame, Luca Gandolfi, Paul Röttger, Dirk Hovy

Breaking AR's Sampling Bottleneck: Provable Acceleration via Diffusion Language Models

Diffusion models have emerged as a powerful paradigm for modern generative modeling, demonstrating strong potential for large language models (LLMs). Unlike conventional autoregressive (AR) models that generate tokens sequentially,…

Machine Learning · Computer Science 2026-01-09 Gen Li, Changxiao Cai

Introspective Diffusion Language Models

Diffusion language models promise parallel generation, yet still lag behind autoregressive (AR) models in quality. We trace this gap to a failure of introspective consistency: AR models agree with their own generations, while DLMs often do…

Artificial Intelligence · Computer Science 2026-04-14 Yifan Yu, Yuqing Jian, Junxiong Wang, Zhongzhu Zhou, Donglin Zhuang, Xinyu Fang, Sri Yanamandra, Xiaoxia Wu, Qingyang Wu, Shuaiwen Leon Song, Tri Dao, Ben Athiwaratkun, James Zou, Fan Lai, Chenfeng Xu

DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models

Diffusion-based decoding has recently emerged as an appealing alternative to autoregressive (AR) generation, offering the potential to update multiple tokens in parallel and reduce latency. However, diffusion vision language models (dVLMs)…

Computer Vision and Pattern Recognition · Computer Science 2026-04-01 Lunbin Zeng, Jingfeng Yao, Bencheng Liao, Hongyuan Tao, Wenyu Liu, Xinggang Wang

WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference

Autoregressive (AR) generation is the standard decoding paradigm for Large Language Models (LLMs), but its token-by-token nature limits parallelism at inference time. Diffusion Language Models (DLLMs) offer parallel decoding by recovering…

Computation and Language · Computer Science 2025-12-30 Aiwei Liu, Minghua He, Shaoxun Zeng, Sijun Zhang, Linhao Zhang, Chuhan Wu, Wei Jia, Yuan Liu, Xiao Zhou, Jie Zhou

Exploring the Power of Diffusion Large Language Models for Software Engineering: An Empirical Investigation

Autoregressive Large Language Models (AR-LLMs) are widely used in software engineering (SE) but face limitations in processing code structure information and suffer from high inference latency. Diffusion LLMs (DLLMs) offer a promising…

Software Engineering · Computer Science 2025-10-07 Jingyao Zhang, Tianlin Li, Xiaoyu Zhang, Qiang Hu, Bin Shi

VL Norm: Rethink Loss Aggregation in RLVR

We propose VL Norm (Variance-reduced Length-dependent Normalization), a simple yet effective loss aggregation method tailored to the dynamic generation lengths characteristic of Reinforcement Learning with Verifiable Rewards (RLVR).…

Machine Learning · Computer Science 2025-10-14 Zhiyuan He, Xufang Luo, Yike Zhang, Yuqing Yang, Lili Qiu

Controlled Diversity: Length-optimized Natural Language Generation

LLMs are not generally able to adjust the length of their outputs based on strict length requirements, a capability that would improve their usefulness in applications that require adherence to diverse user and system requirements. We…

Computation and Language · Computer Science 2025-02-27 Diana Marie Schenke, Timo Baumann