English
Related papers

Related papers: Improving Variable-Length Generation in Diffusion …

200 papers

Diffusion Language Models (DLMs) present a compelling alternative to autoregressive models, offering flexible, any-order infilling without specialized prompting design. However, their practical utility is blocked by a critical limitation:…

Computation and Language · Computer Science 2026-02-03 Zirui Wu , Lin Zheng , Zhihui Xie , Jiacheng Ye , Jiahui Gao , Shansan Gong , Yansong Feng , Zhenguo Li , Wei Bi , Guorui Zhou , Lingpeng Kong

Diffusion-based large language models (dLLMs) have exhibited substantial potential for parallel text generation, which may enable more efficient generation compared to autoregressive models. However, current dLLMs suffer from fixed…

Computation and Language · Computer Science 2025-10-29 Yicun Yang , Cong Wang , Shaobo Wang , Zichen Wen , Biqing Qi , Hanlin Xu , Linfeng Zhang

Diffusion language models (DLMs) provide a bidirectional generation framework naturally suited for infilling, yet their performance is constrained by the pre-specified infilling length. In this paper, we reveal that DLMs possess an inherent…

Machine Learning · Computer Science 2026-02-03 Hengchang Liu , Zhao Yang , Bing Su

Diffusion Large Language Models (DLLMs) are emerging as a powerful alternative to the dominant Autoregressive Large Language Models, offering efficient parallel generation and capable global context modeling. However, the practical…

Computation and Language · Computer Science 2025-08-19 Jinsong Li , Xiaoyi Dong , Yuhang Zang , Yuhang Cao , Jiaqi Wang , Dahua Lin

While Diffusion Language Models (DLMs) are theoretically well-suited for iterative refinement due to their non-causal structure, they often fail to reliably revise incorrect tokens in practice. The key challenge lies in the model's…

Machine Learning · Computer Science 2026-01-30 Shuibai Zhang , Fred Zhangzhi Peng , Yiheng Zhang , Jin Pan , Grigorios G. Chrysos

Diffusion language models (DLMs) have recently emerged as a promising alternative to autoregressive models, primarily due to their ability to enable parallel decoding. Despite this advantage, most existing DLMs rely on a fixed generation…

Machine Learning · Computer Science 2026-05-12 Bian Sun , Kevin Zhai , Mubarak Shah , Zhenyi Wang

Large language models (LLMs) inherently operate over a large generation space, yet conventional usage typically reports the most likely generation (MLG) as a point prediction, which underestimates the model's capability: although the…

Computation and Language · Computer Science 2026-03-25 Ye Li , Anqi Hu , Yuanchang Ye , Shiyan Tong , Zhiyuan Wang , Bo Fu

Diffusion large language models (dLLMs) have recently attracted significant attention for their ability to enhance diversity, controllability, and parallelism. However, their non-sequential, bidirectionally masked generation makes quality…

Computation and Language · Computer Science 2026-03-04 Linhao Zhong , Linyu Wu , Wen Wang , Yuling Xi , Chenchen Jing , Jiaheng Zhang , Hao Chen , Chunhua Shen

While autoregressive Large Vision-Language Models (VLMs) have achieved remarkable success, their sequential generation often limits their efficacy in complex visual planning and dynamic robotic control. In this work, we investigate the…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Jiacheng Ye , Shansan Gong , Jiahui Gao , Junming Fan , Shuang Wu , Wei Bi , Haoli Bai , Lifeng Shang , Lingpeng Kong

While Diffusion Large Language Models (dLLMs) have emerged as a promising non-autoregressive paradigm comparable to autoregressive (AR) models, their faithfulness, specifically regarding hallucination, remains largely underexplored. To…

Computation and Language · Computer Science 2026-04-14 Zhengnan Guo , Fei Tan

Existing Visual Speech Recognition (VSR) systems commonly rely on left-to-right autoregressive decoding, which can force premature decisions on visually ambiguous tokens before sufficient context is available. We propose DLLM-VSR, to the…

Artificial Intelligence · Computer Science 2026-05-28 Jeong Hun Yeo , Chae Won Kim , Hyeongseop Rha , Yong Man Ro

Large Language Models (LLMs) are used for Register-Transfer Level (RTL) code generation, but they face two main challenges: functional correctness and Power, Performance, and Area (PPA) optimization. Iterative, feedback-based methods…

Neural and Evolutionary Computing · Computer Science 2025-10-27 Kyungjun Min , Kyumin Cho , Junhwan Jang , Seokhyeong Kang

Unlike autoregressive language models, which terminate variable-length generation upon predicting an End-of-Sequence (EoS) token, Diffusion Language Models (DLMs) operate over a fixed maximum-length context window for a predetermined number…

Computation and Language · Computer Science 2026-03-09 Vittorio Rossi , Giacomo Cirò , Davide Beltrame , Luca Gandolfi , Paul Röttger , Dirk Hovy

Diffusion models have emerged as a powerful paradigm for modern generative modeling, demonstrating strong potential for large language models (LLMs). Unlike conventional autoregressive (AR) models that generate tokens sequentially,…

Machine Learning · Computer Science 2026-01-09 Gen Li , Changxiao Cai

Diffusion language models promise parallel generation, yet still lag behind autoregressive (AR) models in quality. We stem this gap to a failure of introspective consistency: AR models agree with their own generations, while DLMs often do…

Diffusion-based decoding has recently emerged as an appealing alternative to autoregressive (AR) generation, offering the potential to update multiple tokens in parallel and reduce latency. However, diffusion vision language models (dVLMs)…

Computer Vision and Pattern Recognition · Computer Science 2026-04-01 Lunbin Zeng , Jingfeng Yao , Bencheng Liao , Hongyuan Tao , Wenyu Liu , Xinggang Wang

Autoregressive (AR) generation is the standard decoding paradigm for Large Language Models (LLMs), but its token-by-token nature limits parallelism at inference time. Diffusion Language Models (DLLMs) offer parallel decoding by recovering…

Computation and Language · Computer Science 2025-12-30 Aiwei Liu , Minghua He , Shaoxun Zeng , Sijun Zhang , Linhao Zhang , Chuhan Wu , Wei Jia , Yuan Liu , Xiao Zhou , Jie Zhou

Autoregressive Large Language Models (AR-LLMs) are widely used in software engineering (SE) but face limitations in processing code structure information and suffer from high inference latency. Diffusion LLMs (DLLMs) offer a promising…

Software Engineering · Computer Science 2025-10-07 Jingyao Zhang , Tianlin Li , Xiaoyu Zhang , Qiang Hu , Bin Shi

We propose VL Norm (Variance-reduced Length-dependent Normalization), a simple yet effective loss aggregation method tailored to the characteristic of dynamic generation lengths in Reinforcement Learning with Verifiable Rewards (RLVR).…

Machine Learning · Computer Science 2025-10-14 Zhiyuan He , Xufang Luo , Yike Zhang , Yuqing Yang , Lili Qiu

LLMs are not generally able to adjust the length of their outputs based on strict length requirements, a capability that would improve their usefulness in applications that require adherence to diverse user and system requirements. We…

Computation and Language · Computer Science 2025-02-27 Diana Marie Schenke , Timo Baumann
‹ Prev 1 2 3 10 Next ›