English
Related papers

Related papers: Stability-Weighted Decoding for Diffusion Language…

200 papers

Diffusion Large Language Models (dLLMs) have achieved rapid progress, viewed as a promising alternative to the autoregressive paradigm. However, most dLLM decoders still adopt a global confidence threshold, and do not explicitly model local…

Computation and Language · Computer Science 2026-04-09 Yuzhe Chen , Jiale Cao , Xuyang Liu , Jin Xie , Aiping Yang , Yanwei Pang

Diffusion-based Large Language Models (dLLMs) have emerged as a competitive alternative to autoregressive models, offering unique advantages through bidirectional attention and parallel generation paradigms. However, the generation results…

Computation and Language · Computer Science 2025-10-07 Yifeng Gao , Ziang Ji , Yuxuan Wang , Biqing Qi , Hanlin Xu , Linfeng Zhang

Diffusion large language models (dLLMs) generate text by iteratively denoising masked token sequences. Although dLLMs can predict all masked positions in parallel within each step, the large number of denoising iterations still makes…

Computation and Language · Computer Science 2026-05-18 Shengyin Sun , Yiming Li , Renxi Liu , Xinqi Li , Hui-Ling Zhen , Weizhe Lin , Chen Chen , Xianzhi Yu , Mingxuan Yuan , Chen Ma

Diffusion large language models (dLLMs) generate text through iterative denoising, yet current decoding strategies discard rich intermediate predictions in favor of the final output. Our work here reveals a critical phenomenon, temporal…

Computation and Language · Computer Science 2025-10-07 Wen Wang , Bozhen Fang , Chenchen Jing , Yongliang Shen , Yangyi Shen , Qiuyu Wang , Hao Ouyang , Hao Chen , Chunhua Shen

Despite extensive efforts to align Large Language Models (LLMs) with human values and safety rules, jailbreak attacks that exploit certain vulnerabilities continuously emerge, highlighting the need to strengthen existing LLMs with…

Machine Learning · Computer Science 2025-09-30 Xuekang Wang , Shengyu Zhu , Xueqi Cheng

Diffusion large language models (dLLMs) generate text through iterative denoising. In commonly adopted parallel decoding schemes, each step confirms only high-confidence positions while remasking the others. By analyzing dLLM denoising…

Computation and Language · Computer Science 2026-05-27 Kangyu Wang , Zhiyun Jiang , Haibo Feng , Weijia Zhao , Lin Liu , Jianguo Li , Zhenzhong Lan , Weiyao Lin

Diffusion Large Language Models (dLLMs) have emerged as a promising alternative to autoregressive generation by enabling parallel token prediction. However, practical dLLM decoding still suffers from high inference latency, which limits…

Computation and Language · Computer Science 2026-04-22 Zhenbang Du , Kejing Xia , Xinrui Zhong , Yonggan Fu , Nicolai Oswald , Binfei Ji , Brucek Khailany , Pavlo Molchanov , Yingyan Lin

Diffusion Language Models (DLMs) enable parallel decoding via iterative denoising, where remasking strategies play a critical role in balancing inference speed and output quality. Existing methods predominantly rely on static confidence…

Computation and Language · Computer Science 2026-02-24 Xinhao Sun , Huaijin Zhao , Maoliang Li , Zihao Zheng , Jiayu Chen , Yun Liang , Xiang Chen

Training stability is a persistent challenge in the pre-training of large language models (LLMs), particularly for architectures such as Post-Norm Transformers, which are prone to gradient explosion and dissipation. In this paper, we…

Computation and Language · Computer Science 2025-02-26 Ya Wang , Zhijian Zhuo , Yutao Zeng , Xun Zhou , Jian Yang , Xiaoqing Li

Speculative decoding accelerates large language model (LLM) inference by using a lightweight draft model to propose tokens that are later verified by a stronger target model. While effective in centralized systems, its behavior in…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-18 Jingwei Song , Wanyi Chen , Xinyuan Song , Max , Chris Tong , Gufeng Chen , Tianyi Zhao , Eric Yang , Bill Shi , Lynn Ai

Accelerating the inference of large language models (LLMs) has been a critical challenge in generative AI. Speculative decoding (SD) substantially improves LLM inference efficiency. However, its utility is limited by a fundamental…

Computation and Language · Computer Science 2026-05-05 Sibo Xiao , Jinyuan Fu , Zhongle Xie , Lidan Shou

Steering language model generation toward desired textual properties is essential for practical deployment, and inference-time methods are particularly appealing because they enable controllable generation without retraining. Recent work…

Computation and Language · Computer Science 2026-05-29 Hyeseon An , Yo-Sub Han

Diffusion Large Language Models (dLLMs) represent a new paradigm beyond autoregressive modeling, offering competitive performance while naturally enabling a flexible decoding process. Specifically, dLLMs can generate tokens at arbitrary…

Computation and Language · Computer Science 2026-02-13 Sicheng Feng , Zigeng Chen , Xinyin Ma , Gongfan Fang , Xinchao Wang

Large Language Models (LLMs) have demonstrated impressive performance on multiple-choice question answering (MCQA) benchmarks, yet they remain highly vulnerable to minor input perturbations. In this paper, we introduce and evaluate Token…

Computation and Language · Computer Science 2025-06-12 Jui-Ming Yao , Hao-Yuan Chen , Zi-Xian Tang , Bing-Jia Tan , Sheng-Wei Peng , Bing-Cheng Xie , Shun-Feng Su

Masked diffusion models (MDMs) offer a promising non-autoregressive alternative for large language modeling. Standard decoding methods for MDMs, such as confidence-based sampling, select tokens independently based on individual token…

Computation and Language · Computer Science 2025-09-23 Daehoon Gwak , Minseo Jung , Junwoo Park , Minho Park , ChaeHun Park , Junha Hyung , Jaegul Choo

Large language model (LLM) inference often suffers from high decoding latency and limited scalability across heterogeneous edge-cloud environments. Existing speculative decoding (SD) techniques accelerate token generation but remain…

Machine Learning · Computer Science 2025-12-02 Fengze Yu , Leshu Li , Brad McDanel , Sai Qian Zhang

Diffusion large language models (dLLMs) offer a promising paradigm for parallel text generation, but in practice they face an accuracy-parallelism trade-off, where increasing tokens per forward (TPF) often degrades generation quality.…

Computation and Language · Computer Science 2026-05-12 Haoyang Zhou , Li Kong , Shijie Ren , Xiting Wang , Shuang Liang , Guowei Wang , Zhenxuan Pan

Diffusion large language models (dLLMs) have recently emerged as a promising alternative to autoregressive (AR) models, offering advantages such as accelerated parallel decoding and bidirectional context modeling. However, the vanilla…

Computation and Language · Computer Science 2025-10-07 Runchu Tian , Junxia Cui , Xueqiang Xu , Feng Yao , Jingbo Shang

Diffusion models have shown promise in text generation, but often struggle with generating long, coherent, and contextually accurate text. Token-level diffusion doesn't model word-order dependencies explicitly and operates on short, fixed…

Computation and Language · Computer Science 2025-05-27 Xiaochen Zhu , Georgi Karadzhov , Chenxi Whitehouse , Andreas Vlachos

While LLM-based Automatic Speech Recognition (ASR) achieves high accuracy, its speed is limited by sequential autoregressive decoding. Diffusion Language Models (DLMs) offer a parallel alternative, yet their decoding strategies remain…

Audio and Speech Processing · Electrical Eng. & Systems 2026-05-29 Jeong Hun Yeo , Minsu Kim , Hyeongseop Rha , Yong Man Ro
‹ Prev 1 2 3 10 Next ›