Related papers: Corrective Diffusion Language Models

200 papers

Masked diffusion language models enable parallel token generation and offer improved decoding efficiency over autoregressive models. However, their performance degrades significantly when generating multiple tokens simultaneously, due to a…

Computation and Language · Computer Science 2026-05-12 Houxing Ren, Mingjie Zhan, Zimu Lu, Ke Wang, Yunqiao Yang, Haotian Hou, Junting Pan, Hongsheng Li

Mask-based Diffusion Language Models (DLMs) struggle to revise incorrect tokens: once a token is generated, it typically remains fixed. The key challenge is to identify potential errors in the inputs. In this paper, we propose…

Computation and Language · Computer Science 2025-09-30 Zemin Huang, Yuhang Wang, Zhiyang Chen, Guo-Jun Qi

Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive models for faster inference via parallel token generation. We provide a rigorous foundation for this advantage by formalizing a model of parallel…

Machine Learning · Computer Science 2026-01-01 Haozhe Jiang, Nika Haghtalab, Lijie Chen

Diffusion large language models (dLLMs) have recently emerged as a promising alternative to autoregressive (AR) models, offering advantages such as accelerated parallel decoding and bidirectional context modeling. However, the vanilla…

Computation and Language · Computer Science 2025-10-07 Runchu Tian, Junxia Cui, Xueqiang Xu, Feng Yao, Jingbo Shang

Diffusion Language Models (DLMs) offer a promising alternative for language modeling by enabling parallel decoding through iterative refinement. However, most DLMs rely on hard binary masking and discrete token assignments, which hinder the…

Computation and Language · Computer Science 2026-01-19 Linhao Zhong, Linyu Wu, Bozhen Fang, Tianjian Feng, Chenchen Jing, Wen Wang, Jiaheng Zhang, Hao Chen, Chunhua Shen

Masked diffusion models (MDMs) have emerged as a promising alternative to autoregressive models, enabling parallel token generation while achieving competitive performance. Despite these advantages, MDMs face a fundamental limitation: once…

Diffusion Language Models (DLMs) are rapidly emerging as a powerful and promising alternative to the dominant autoregressive (AR) paradigm. By generating tokens in parallel through an iterative denoising process, DLMs possess inherent…

Computation and Language · Computer Science 2025-12-08 Tianyi Li, Mingda Chen, Bowei Guo, Zhiqiang Shen

Diffusion (Large) Language Models (dLLMs) now match the downstream performance of their autoregressive counterparts on many tasks, while holding the promise of being more efficient during inference. One critical design aspect of dLLMs is…

Diffusion Language Models (DLMs) provide a promising alternative to autoregressive language models by generating text through iterative denoising and bidirectional refinement. However, this iterative generation paradigm also introduces…

Computation and Language · Computer Science 2026-05-14 Yejin Lee, Yo-Sub Han

Discrete diffusion language models (dLLMs) provide a fast and flexible alternative to autoregressive models (ARMs) via iterative denoising with parallel updates. However, their evaluation is challenging: existing metrics conflate denoiser…

Machine Learning · Computer Science 2026-05-29 Luhan Tang, Longxuan Yu, Shaorong Zhang, Greg Ver Steeg

Diffusion language models (DLMs) have recently demonstrated capabilities that complement standard autoregressive (AR) models, particularly in non-sequential generation and bidirectional editing. Although recent work has shown that…

Machine Learning · Computer Science 2026-05-11 Fred Zhangzhi Peng, Alexis Fox, Anru R. Zhang, Alexander Tong

Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive (AR) models for language modeling, allowing flexible generation order and parallel generation of multiple tokens. However, this flexibility…

Machine Learning · Computer Science 2026-03-24 Changxiao Cai, Gen Li

Masked diffusion language models (MDLMs) have emerged as a promising alternative to dominant autoregressive approaches. Although they achieve competitive performance on several tasks, a substantial gap remains in open-ended text generation.…

Computation and Language · Computer Science 2026-02-02 Mengyu Ye, Ryosuke Takahashi, Keito Kudo, Jun Suzuki

Diffusion language models promise parallel generation, yet still lag behind autoregressive (AR) models in quality. We trace this gap to a failure of introspective consistency: AR models agree with their own generations, while DLMs often do…

Discrete masked diffusion language models such as LLaDA generate text through iterative denoising, where mask tokens are progressively replaced with predicted tokens. LLaDA2.1 introduced a Token-to-Token (T2T) editing mechanism that…

Computation and Language · Computer Science 2026-05-27 Lin Yao

Discrete diffusion models have emerged as a promising direction for vision-language tasks, offering bidirectional context modeling and theoretical parallelization. However, their practical application is severely hindered by a…

Computation and Language · Computer Science 2025-10-24 Yatai Ji, Teng Wang, Yuying Ge, Zhiheng Liu, Sidi Yang, Ying Shan, Ping Luo

Diffusion Large Language Models (dLLMs) have emerged as a promising alternative to purely autoregressive language models because they can decode multiple tokens in parallel. However, state-of-the-art block-wise dLLMs rely on a "remasking"…

Diffusion language models (DLMs) are an attractive alternative to autoregressive models because they promise sublinear-time, parallel generation, yet practical gains remain elusive as high-quality samples still demand hundreds of refinement…

Machine Learning · Computer Science 2026-05-04 Hasan Amin, Yuan Gao, Yaser Souri, Subhojit Som, Ming Yin, Rajiv Khanna, Xia Song

Diffusion Language Models (DLMs) offer a promising parallel generation paradigm but suffer from slow inference due to numerous refinement steps and the inability to use standard KV caching. We introduce CDLM (Consistency Diffusion Language…

Machine Learning · Computer Science 2026-02-23 Minseo Kim, Chenfeng Xu, Coleman Hooper, Harman Singh, Ben Athiwaratkun, Ce Zhang, Kurt Keutzer, Amir Gholami

While Masked Diffusion Language Models (MDLMs) relying on token masking and unmasking have shown promise in language modeling, their computational efficiency and generation flexibility remain constrained by the masking paradigm. In this…

Computation and Language · Computer Science 2026-03-26 Fangyu Ding, Ding Ding, Sijin Chen, Kaibo Wang, Peng Xu, Zijin Feng, Haoli Bai, Kai Han, Youliang Yan, Binhang Yuan, Jiacheng Sun