Related papers: DICE: Diffusion Large Language Models Excel at Gen…

Divide and Conquer: Accelerating Diffusion-Based Large Language Models via Adaptive Parallel Decoding

Diffusion-based large language models (dLLMs) have shown promising performance across various reasoning tasks, establishing themselves as an alternative to autoregressive large language models (LLMs). Unlike autoregressive LLMs that…

Computation and Language · Computer Science 2026-03-02 Xiangzhong Luo, Yilin An, Zhicheng Yu, Weichen Liu, Xu Yang

DARE: Diffusion Large Language Models Alignment and Reinforcement Executor

Diffusion large language models (dLLMs) are emerging as a compelling alternative to dominant autoregressive models, replacing strictly sequential token generation with iterative denoising and parallel generation dynamics. However, their…

Computation and Language · Computer Science 2026-04-07 Jingyi Yang, Yuxian Jiang, Xuhao Hu, Shuang Cheng, Biqing Qi, Jing Shao

Scaling Diffusion Language Models via Adaptation from Autoregressive Models

Diffusion Language Models (DLMs) have emerged as a promising new paradigm for text generative modeling, potentially addressing limitations of autoregressive (AR) models. However, current DLMs have been studied at a smaller scale compared to…

Computation and Language · Computer Science 2025-06-03 Shansan Gong, Shivam Agarwal, Yizhe Zhang, Jiacheng Ye, Lin Zheng, Mukai Li, Chenxin An, Peilin Zhao, Wei Bi, Jiawei Han, Hao Peng, Lingpeng Kong

Diffuse Thinking: Exploring Diffusion Language Models as Efficient Thought Proposers for Reasoning

In recent years, large language models (LLMs) have witnessed remarkable advancements, with the test-time scaling law consistently enhancing the reasoning capabilities. Through systematic evaluation and exploration of a diverse spectrum of…

Computation and Language · Computer Science 2025-11-03 Chenyang Shao, Sijian Ren, Fengli Xu, Yong Li

Exploring the Power of Diffusion Large Language Models for Software Engineering: An Empirical Investigation

Autoregressive Large Language Models (AR-LLMs) are widely used in software engineering (SE) but face limitations in processing code structure information and suffer from high inference latency. Diffusion LLMs (DLLMs) offer a promising…

Software Engineering · Computer Science 2025-10-07 Jingyao Zhang, Tianlin Li, Xiaoyu Zhang, Qiang Hu, Bin Shi

Diffusion LLMs Can Do Faster-Than-AR Inference via Discrete Diffusion Forcing

Diffusion Large Language Models (dLLMs) have emerged as a promising alternative to autoregressive (AR) LLMs for text generation, with the potential to decode multiple tokens in a single iteration. However, none of the existing open-source…

Machine Learning · Computer Science 2025-08-14 Xu Wang, Chenkai Xu, Yijie Jin, Jiachun Jin, Hao Zhang, Zhijie Deng

DICE: Structured Reasoning in LLMs through SLM-Guided Chain-of-Thought Correction

When performing reasoning tasks with user-specific requirements, such as strict output formats, large language models (LLMs) often prioritize reasoning over adherence to detailed instructions. Fine-tuning LLMs on supervised datasets to…

Computation and Language · Computer Science 2025-10-21 Yiqi Li, Yusheng Liao, Zhe Chen, Yanfeng Wang, Yu Wang

DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation

Diffusion large language models (dLLMs) are compelling alternatives to autoregressive (AR) models because their denoising models operate over the entire sequence. The global planning and iterative refinement features of dLLMs are…

Computation and Language · Computer Science 2025-06-27 Shansan Gong, Ruixiang Zhang, Huangjie Zheng, Jiatao Gu, Navdeep Jaitly, Lingpeng Kong, Yizhe Zhang

Discrete Diffusion in Large Language and Multimodal Models: A Survey

In this work, we provide a systematic survey of Discrete Diffusion Language Models (dLLMs) and Discrete Diffusion Multimodal Language Models (dMLLMs). Unlike autoregressive (AR) models, dLLMs and dMLLMs adopt a multi-token, parallel…

Machine Learning · Computer Science 2025-09-22 Runpeng Yu, Qi Li, Xinchao Wang

Residual Context Diffusion Language Models

Diffusion Large Language Models (dLLMs) have emerged as a promising alternative to purely autoregressive language models because they can decode multiple tokens in parallel. However, state-of-the-art block-wise dLLMs rely on a "remasking"…

Computation and Language · Computer Science 2026-02-02 Yuezhou Hu, Harman Singh, Monishwaran Maheswaran, Haocheng Xi, Coleman Hooper, Jintao Zhang, Aditya Tomar, Michael W. Mahoney, Sewon Min, Mehrdad Farajtabar, Kurt Keutzer, Amir Gholami, Chenfeng Xu

DiffBench Meets DiffAgent: End-to-End LLM-Driven Diffusion Acceleration Code Generation

Diffusion models have achieved remarkable success in image and video generation. However, their inherently multi-step inference process imposes substantial computational overhead, hindering real-world deployment. Accelerating diffusion…

Computer Vision and Pattern Recognition · Computer Science 2026-01-07 Jiajun Jiao, Haowei Zhu, Puyuan Yang, Jianghui Wang, Ji Liu, Ziqiong Liu, Dong Li, Yuejian Fang, Junhai Yong, Bin Wang, Emad Barsoum

d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning

Recent large language models (LLMs) have demonstrated strong reasoning capabilities that benefit from online reinforcement learning (RL). These capabilities have primarily been demonstrated within the left-to-right autoregressive (AR)…

Computation and Language · Computer Science 2025-06-04 Siyan Zhao, Devaansh Gupta, Qinqing Zheng, Aditya Grover

CD4LM: Consistency Distillation and aDaptive Decoding for Diffusion Language Models

Autoregressive large language models achieve strong results on many benchmarks, but decoding remains fundamentally latency-limited by sequential dependence on previously generated tokens. Diffusion language models (DLMs) promise parallel…

Computation and Language · Computer Science 2026-01-06 Yihao Liang, Ze Wang, Hao Chen, Ximeng Sun, Jialian Wu, Xiaodong Yu, Jiang Liu, Emad Barsoum, Zicheng Liu, Niraj K. Jha

CDLM: Consistency Diffusion Language Models For Faster Sampling

Diffusion Language Models (DLMs) offer a promising parallel generation paradigm but suffer from slow inference due to numerous refinement steps and the inability to use standard KV caching. We introduce CDLM (Consistency Diffusion Language…

Machine Learning · Computer Science 2026-02-23 Minseo Kim, Chenfeng Xu, Coleman Hooper, Harman Singh, Ben Athiwaratkun, Ce Zhang, Kurt Keutzer, Amir Gholami

A Survey on Diffusion Language Models

Diffusion Language Models (DLMs) are rapidly emerging as a powerful and promising alternative to the dominant autoregressive (AR) paradigm. By generating tokens in parallel through an iterative denoising process, DLMs possess inherent…

Computation and Language · Computer Science 2025-12-08 Tianyi Li, Mingda Chen, Bowei Guo, Zhiqiang Shen

Chunk-Distilled Language Modeling

We introduce Chunk-Distilled Language Modeling (CD-LM), an approach to text generation that addresses two challenges in current large language models (LLMs): the inefficiency of token-level generation, and the difficulty of adapting to new…

Computation and Language · Computer Science 2025-01-03 Yanhong Li, Karen Livescu, Jiawei Zhou

Top 10 Open Challenges Steering the Future of Diffusion Language Model and Its Variants

The paradigm of Large Language Models (LLMs) is currently defined by auto-regressive (AR) architectures, which generate text through a sequential "brick-by-brick" process. Despite their success, AR models are inherently constrained by a…

Computation and Language · Computer Science 2026-01-21 Yunhe Wang, Kai Han, Huiling Zhen, Yuchuan Tian, Hanting Chen, Yongbing Huang, Yufei Cui, Yingte Shu, Shan Gao, Ismail Elezi, Roy Vaughan Miles, Songcen Xu, Feng Wen, Chao Xu, Sinan Zeng, Dacheng Tao

dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching

Autoregressive Models (ARMs) have long dominated the landscape of Large Language Models. Recently, a new paradigm has emerged in the form of diffusion-based Large Language Models (dLLMs), which generate text by iteratively denoising masked…

Machine Learning · Computer Science 2025-06-10 Zhiyuan Liu, Yicun Yang, Yaojie Zhang, Junjie Chen, Chang Zou, Qingyuan Wei, Shaobo Wang, Linfeng Zhang

DIFFA: Large Language Diffusion Models Can Listen and Understand

Recent advances in large language models (LLMs) have shown remarkable capabilities across textual and multimodal domains. In parallel, diffusion-based language models have emerged as a promising alternative to the autoregressive paradigm,…

Sound · Computer Science 2025-11-11 Jiaming Zhou, Hongjie Chen, Shiwan Zhao, Jian Kang, Jie Li, Enzhi Wang, Yujie Guo, Haoqin Sun, Hui Wang, Aobo Kong, Yong Qin, Xuelong Li

LLaDA2.0: Scaling Up Diffusion Language Models to 100B

This paper presents LLaDA2.0 -- a tuple of discrete diffusion large language models (dLLM) scaling up to 100B total parameters through systematic conversion from auto-regressive (AR) models -- establishing a new paradigm for frontier-scale…

Machine Learning · Computer Science 2025-12-25 Tiwei Bie, Maosong Cao, Kun Chen, Lun Du, Mingliang Gong, Zhuochen Gong, Yanmei Gu, Jiaqi Hu, Zenan Huang, Zhenzhong Lan, Chengxi Li, Chongxuan Li, Jianguo Li, Zehuan Li, Huabin Liu, Lin Liu, Guoshan Lu, Xiaocheng Lu, Yuxin Ma, Jianfeng Tan, Lanning Wei, Ji-Rong Wen, Yipeng Xing, Xiaolu Zhang, Junbo Zhao, Da Zheng, Jun Zhou, Junlin Zhou, Zhanchao Zhou, Liwang Zhu, Yihong Zhuang