Related papers: NI Sampling: Accelerating Discrete Diffusion Sampl…

Attention-Based Sampler for Diffusion Language Models

Auto-regressive models (ARMs) have established a dominant paradigm in language modeling. However, their strictly sequential decoding paradigm imposes fundamental constraints on both inference efficiency and modeling flexibility. To address…

Computation and Language · Computer Science 2026-04-13 Yuyan Zhou, Kai Syun Hou, Weiyu Chen, James Kwok

Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles

Diffusion-based language models (dLLMs) have emerged as a promising alternative to traditional autoregressive LLMs by enabling parallel token generation and significantly reducing inference latency. However, existing sampling strategies for…

Computation and Language · Computer Science 2026-04-01 Qingyan Wei, Yaojie Zhang, Zhiyuan Liu, Puyu Zeng, Yuxuan Wang, Biqing Qi, Dongrui Liu, Linfeng Zhang

Is Your Diffusion Sampler Actually Correct? A Sampler-Centric Evaluation of Discrete Diffusion Language Models

Discrete diffusion language models (dLLMs) provide a fast and flexible alternative to autoregressive models (ARMs) via iterative denoising with parallel updates. However, their evaluation is challenging: existing metrics conflate denoiser…

Machine Learning · Computer Science 2026-05-29 Luhan Tang, Longxuan Yu, Shaorong Zhang, Greg Ver Steeg

Learnable Sampler Distillation for Discrete Diffusion Models

Discrete diffusion models (DDMs) have shown powerful generation ability for discrete data modalities like text and molecules. However, their practical application is hindered by inefficient sampling, requiring a large number of sampling…

Machine Learning · Computer Science 2025-09-25 Feiyang Fu, Tongxian Guo, Zhaoqiang Liu

Breaking AR's Sampling Bottleneck: Provable Acceleration via Diffusion Language Models

Diffusion models have emerged as a powerful paradigm for modern generative modeling, demonstrating strong potential for large language models (LLMs). Unlike conventional autoregressive (AR) models that generate tokens sequentially,…

Machine Learning · Computer Science 2026-01-09 Gen Li, Changxiao Cai

Fast Sampling via Discrete Non-Markov Diffusion Models with Predetermined Transition Time

Discrete diffusion models have emerged as powerful tools for high-quality data generation. Despite their success in discrete spaces, such as text generation tasks, the acceleration of discrete diffusion models remains under-explored. In…

Machine Learning · Computer Science 2024-12-09 Zixiang Chen, Huizhuo Yuan, Yongqian Li, Yiwen Kou, Junkai Zhang, Quanquan Gu

MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization

Recent advances in diffusion language models (DLMs) have presented a promising alternative to traditional autoregressive large language models (LLMs). However, DLMs still lag behind LLMs in reasoning performance, especially as the number of…

Computation and Language · Computer Science 2025-10-27 Chenglong Wang, Yang Gan, Hang Zhou, Chi Hu, Yongyu Mu, Kai Song, Murun Yang, Bei Li, Chunliang Zhang, Tongran Liu, Jingbo Zhu, Zhengtao Yu, Tong Xiao

Align Your Steps: Optimizing Sampling Schedules in Diffusion Models

Diffusion models (DMs) have established themselves as the state-of-the-art generative modeling approach in the visual domain and beyond. A crucial drawback of DMs is their slow sampling speed, relying on many sequential function evaluations…

Computer Vision and Pattern Recognition · Computer Science 2024-04-24 Amirmojtaba Sabour, Sanja Fidler, Karsten Kreis

FlashDLM: Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion

Diffusion language models offer parallel token generation and inherent bidirectionality, promising more efficient and powerful sequence modeling compared to autoregressive approaches. However, state-of-the-art diffusion models (e.g., Dream…

Computation and Language · Computer Science 2025-10-10 Zhanqiu Hu, Jian Meng, Yash Akhauri, Mohamed S. Abdelfattah, Jae-sun Seo, Zhiru Zhang, Udit Gupta

DyLLM: Efficient Diffusion LLM Inference via Saliency-based Token Selection and Partial Attention

Masked Diffusion Language Models (MDLMs) enable parallel token decoding, providing a promising alternative to the sequential nature of autoregressive generation. However, their iterative denoising process remains computationally expensive…

Computation and Language · Computer Science 2026-03-10 Younjoo Lee, Junghoo Lee, Seungkyun Dan, Jaiyoung Park, Jung Ho Ahn

A Survey on Diffusion Language Models

Diffusion Language Models (DLMs) are rapidly emerging as a powerful and promising alternative to the dominant autoregressive (AR) paradigm. By generating tokens in parallel through an iterative denoising process, DLMs possess inherent…

Computation and Language · Computer Science 2025-12-08 Tianyi Li, Mingda Chen, Bowei Guo, Zhiqiang Shen

CDLM: Consistency Diffusion Language Models For Faster Sampling

Diffusion Language Models (DLMs) offer a promising parallel generation paradigm but suffer from slow inference due to numerous refinement steps and the inability to use standard KV caching. We introduce CDLM (Consistency Diffusion Language…

Machine Learning · Computer Science 2026-02-23 Minseo Kim, Chenfeng Xu, Coleman Hooper, Harman Singh, Ben Athiwaratkun, Ce Zhang, Kurt Keutzer, Amir Gholami

Beyond Autoregression: Fast LLMs via Self-Distillation Through Time

Autoregressive (AR) Large Language Models (LLMs) have demonstrated significant success across numerous tasks. However, the AR modeling paradigm presents certain limitations; for instance, contemporary autoregressive LLMs are trained to…

Machine Learning · Computer Science 2025-02-10 Justin Deschenaux, Caglar Gulcehre

DiffSampling: Enhancing Diversity and Accuracy in Neural Text Generation

Despite their growing capabilities, language models still frequently reproduce content from their training data, generate repetitive text, and favor common grammatical patterns and vocabulary. A possible cause is the decoding strategy: the…

Computation and Language · Computer Science 2026-01-15 Giorgio Franceschelli, Mirco Musolesi

Sampling-Aware Quantization for Diffusion Models

Diffusion models have recently emerged as the dominant approach in visual generation tasks. However, the lengthy denoising chains and the computationally intensive noise estimation networks hinder their applicability in low-latency and…

Computer Vision and Pattern Recognition · Computer Science 2026-04-23 Qian Zeng, Jie Song, Yuanyu Wan, Huiqiong Wang, Mingli Song

Improving Diffusion Language Model Decoding through Joint Search in Generation Order and Token Space

Diffusion Language Models (DLMs) offer order-agnostic generation that can explore many possible decoding trajectories. However, current decoding methods commit to a single trajectory, limiting exploration in trajectory space. We introduce…

Computation and Language · Computer Science 2026-02-06 Yangyi Shen, Tianjian Feng, Jiaqi Han, Wen Wang, Tianlang Chen, Chunhua Shen, Jure Leskovec, Stefano Ermon

Are First-Order Diffusion Samplers Really Slower? A Fast Forward-Value Approach

Higher-order ODE solvers have become a standard tool for accelerating diffusion probabilistic model (DPM) sampling, motivating the widespread view that first-order methods are inherently slower and that increasing discretization order is…

Machine Learning · Statistics 2026-01-01 Yuchen Jiao, Na Li, Changxiao Cai, Gen Li

Beyond Next-Token Prediction: A Performance Characterization of Diffusion versus Autoregressive Language Models

Large Language Models (LLMs) have achieved state-of-the-art performance on a broad range of Natural Language Processing (NLP) tasks, including document processing and code generation. Autoregressive Language Models (ARMs), which generate…

Machine Learning · Computer Science 2025-12-16 Minseo Kim, Coleman Hooper, Aditya Tomar, Chenfeng Xu, Mehrdad Farajtabar, Michael W. Mahoney, Kurt Keutzer, Amir Gholami

Saber: An Efficient Sampling with Adaptive Acceleration and Backtracking Enhanced Remasking for Diffusion Language Model

Diffusion language models (DLMs) are emerging as a compelling alternative to the dominant autoregressive paradigm, offering inherent advantages in parallel generation and bidirectional context modeling. However, for the tasks with strict…

Artificial Intelligence · Computer Science 2026-04-30 Yihong Dong, Zhaoyu Ma, Xue Jiang, Zhiyuan Fan, Jiaru Qian, Yongmin Li, Jianha Xiao, Zhi Jin, Rongyu Cao, Binhua Li, Fei Huang, Yongbin Li, Ge Li

Accelerating Guided Diffusion Sampling with Splitting Numerical Methods

Guided diffusion is a technique for conditioning the output of a diffusion model at sampling time without retraining the network for each specific task. One drawback of diffusion models, however, is their slow sampling process. Recent…

Computer Vision and Pattern Recognition · Computer Science 2023-01-30 Suttisak Wizadwongsa, Supasorn Suwajanakorn