English
Related papers

Related papers: Attention-Based Sampler for Diffusion Language Mod…

200 papers

Autoregressive (AR) generation is the standard decoding paradigm for Large Language Models (LLMs), but its token-by-token nature limits parallelism at inference time. Diffusion Language Models (DLLMs) offer parallel decoding by recovering…

Computation and Language · Computer Science 2025-12-30 Aiwei Liu , Minghua He , Shaoxun Zeng , Sijun Zhang , Linhao Zhang , Chuhan Wu , Wei Jia , Yuan Liu , Xiao Zhou , Jie Zhou

Large Language Models (LLMs) have achieved state-of-the-art performance on a broad range of Natural Language Processing (NLP) tasks, including document processing and code generation. Autoregressive Language Models (ARMs), which generate…

Discrete diffusion language models (dLLMs) have recently emerged as a promising alternative to traditional autoregressive approaches, offering the flexibility to generate tokens in arbitrary orders and the potential of parallel decoding.…

Machine Learning · Computer Science 2026-04-28 Enshu Liu , Xuefei Ning , Yu Wang , Zinan Lin

Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive (AR) models for language modeling, allowing flexible generation order and parallel generation of multiple tokens. However, this flexibility…

Machine Learning · Computer Science 2026-03-24 Changxiao Cai , Gen Li

Masked Diffusion Language Models (MDLMs) enable parallel token decoding, providing a promising alternative to the sequential nature of autoregressive generation. However, their iterative denoising process remains computationally expensive…

Computation and Language · Computer Science 2026-03-10 Younjoo Lee , Junghoo Lee , Seungkyun Dan , Jaiyoung Park , Jung Ho Ahn

Diffusion Language Models (DLMs) enable globally coherent, bidirectional, and controllable text generation, offering advantages over traditional autoregressive LLMs, while scaling to ultra-long sequences remains costly. Many existing…

Computer Vision and Pattern Recognition · Computer Science 2026-05-20 Wenhu Zhang , Yiming Wu , Huanyu Wang , Yaoyang Liu , Huanzhang Dou , Senqiao Yang , Sitong Wu , Hanbin Zhao , Jiaya Jia

Diffusion Language Models (DLMs) are rapidly emerging as a powerful and promising alternative to the dominant autoregressive (AR) paradigm. By generating tokens in parallel through an iterative denoising process, DLMs possess inherent…

Computation and Language · Computer Science 2025-12-08 Tianyi Li , Mingda Chen , Bowei Guo , Zhiqiang Shen

Discrete diffusion language models (dLLMs) provide a fast and flexible alternative to autoregressive models (ARMs) via iterative denoising with parallel updates. However, their evaluation is challenging: existing metrics conflate denoiser…

Machine Learning · Computer Science 2026-05-29 Luhan Tang , Longxuan Yu , Shaorong Zhang , Greg Ver Steeg

Diffusion Large Language Models (DLLMs) offer a compelling alternative to Auto-Regressive models, but their deployment is constrained by high decoding cost. In this work, we identify a key inefficiency in DLLM decoding: while computation is…

Machine Learning · Computer Science 2026-02-02 Kaihua Liang , Xin Tan , An Zhong , Hong Xu , Marco Canini

Diffusion-based language models (dLLMs) have emerged as a promising alternative to autoregressive language models, offering the potential for parallel token generation and bidirectional context modeling. However, harnessing this flexibility…

Computation and Language · Computer Science 2026-05-28 Jiyeon Kim , Sungik Choi , Yongrae Jo , Moontae Lee , Minjoon Seo

The generation speed of LLMs are bottlenecked by autoregressive decoding, where tokens are predicted sequentially one by one. Alternatively, diffusion large language models (dLLMs) theoretically allow for parallel token generation, but in…

Computation and Language · Computer Science 2025-11-03 Daniel Israel , Guy Van den Broeck , Aditya Grover

Autoregressive Models (ARMs) have long dominated the landscape of Large Language Models. Recently, a new paradigm has emerged in the form of diffusion-based Large Language Models (dLLMs), which generate text by iteratively denoising masked…

Machine Learning · Computer Science 2025-06-10 Zhiyuan Liu , Yicun Yang , Yaojie Zhang , Junjie Chen , Chang Zou , Qingyuan Wei , Shaobo Wang , Linfeng Zhang

In this work, we provide a systematic survey of Discrete Diffusion Language Models (dLLMs) and Discrete Diffusion Multimodal Language Models (dMLLMs). Unlike autoregressive (AR) models, dLLMs and dMLLMs adopt a multi-token, parallel…

Machine Learning · Computer Science 2025-09-22 Runpeng Yu , Qi Li , Xinchao Wang

Automatic speech recognition (ASR) systems based on large language models (LLMs) achieve superior performance by leveraging pretrained LLMs as decoders, but their token-by-token generation mechanism leads to inference latency that grows…

Sound · Computer Science 2026-01-27 Wenjie Tian , Bingshen Mu , Guobin Ma , Xuelong Geng , Zhixian Zhao , Lei Xie

Autoregressive (AR) large language models (LLMs) have achieved remarkable performance across a wide range of natural language tasks, yet their inherent sequential decoding limits inference efficiency. In this work, we propose Fast-dLLM v2,…

Computation and Language · Computer Science 2025-10-01 Chengyue Wu , Hao Zhang , Shuchen Xue , Shizhe Diao , Yonggan Fu , Zhijian Liu , Pavlo Molchanov , Ping Luo , Song Han , Enze Xie

Masked Diffusion Language Models (DLMs) have recently emerged as a promising alternative to traditional Autoregressive Models (ARMs). DLMs employ transformer encoders with bidirectional attention, enabling parallel token generation while…

Computation and Language · Computer Science 2025-12-11 Maximo Eduardo Rulli , Simone Petruzzi , Edoardo Michielon , Fabrizio Silvestri , Simone Scardapane , Alessio Devoto

In arbitrary-order language models, it is an open question how to sample tokens in parallel from the correct joint distribution. With discrete diffusion models, the more tokens they generate in parallel, the less their predicted…

Machine Learning · Computer Science 2025-04-30 Gabe Guo , Stefano Ermon

Diffusion Language Models (DLMs) offer order-agnostic generation that can explore many possible decoding trajectories. However, current decoding methods commit to a single trajectory, limiting exploration in trajectory space. We introduce…

Computation and Language · Computer Science 2026-02-06 Yangyi Shen , Tianjian Feng , Jiaqi Han , Wen Wang , Tianlang Chen , Chunhua Shen , Jure Leskovec , Stefano Ermon

Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive models for faster inference via parallel token generation. We provide a rigorous foundation for this advantage by formalizing a model of parallel…

Machine Learning · Computer Science 2026-01-01 Haozhe Jiang , Nika Haghtalab , Lijie Chen

Diffusion-based large language models (dLLMs) are gaining attention for their inherent capacity for parallel decoding, offering a compelling alternative to autoregressive LLMs. Among various decoding strategies, block-wise…

Machine Learning · Computer Science 2026-03-03 Guanxi Lu , Hao Mark Chen , Yuto Karashima , Zhican Wang , Daichi Fujiki , Hongxiang Fan
‹ Prev 1 2 3 10 Next ›