Related papers: Progressive Refinement Regulation for Accelerating…

200 papers

Autoregressive (AR) models remain the standard for natural language generation but still suffer from high latency due to strictly sequential decoding. Recent diffusion-inspired approaches, such as LLaDA and Dream, mitigate this by…

Computation and Language · Computer Science · 2025-10-16 · Qinglin Zhu, Yizhen Yao, Runcong Zhao, Yanzheng Xiang, Amrutha Saseendran, Chen Jin, Philip Teare, Bin Liang, Yulan He, Lin Gui

Diffusion language models generate text through iterative refinement, a process that is often computationally inefficient because many tokens reach stability long before the final denoising step. We introduce a training-free, token-level…

Machine Learning · Computer Science · 2026-02-12 · Zahar Kohut, Severyn Shykula, Dmytro Khamula, Mykola Vysotskyi, Taras Rumezhak, Volodymyr Karpiv

Discrete diffusion models have emerged as a promising direction for vision-language tasks, offering bidirectional context modeling and theoretical parallelization. However, their practical application is severely hindered by a…

Computation and Language · Computer Science · 2025-10-24 · Yatai Ji, Teng Wang, Yuying Ge, Zhiheng Liu, Sidi Yang, Ying Shan, Ping Luo

Most large language models are autoregressive: they generate tokens one at a time. Discrete diffusion language models can generate multiple tokens in parallel, but sampling from them requires a denoising order: a strategy for deciding which…

Artificial Intelligence · Computer Science · 2026-03-27 · Daniel Israel, Tian Jin, Ellie Cheng, Guy Van den Broeck, Aditya Grover, Suvinay Subramanian, Michael Carbin

Diffusion-based large language models offer a non-autoregressive alternative for text generation, but enabling them to perform complex reasoning remains challenging. Reinforcement learning has recently emerged as an effective post-training…

Artificial Intelligence · Computer Science · 2026-04-14 · Shaoan Xie, Lingjing Kong, Xiangchen Song, Xinshuai Dong, Guangyi Chen, Eric P. Xing, Kun Zhang

Autoregressive (AR) language models generate text one token at a time, which limits their inference speed. Diffusion-based language models offer a promising alternative, as they can decode multiple tokens in parallel. However, we identify a…

Computation and Language · Computer Science · 2025-10-27 · Yeongbin Seo, Dongha Lee, Jaehyung Kim, Jinyoung Yeo

Diffusion models have been shown to implicitly generate visual content autoregressively in the frequency domain, where low-frequency components are generated earlier in the denoising process while high-frequency details emerge only in later…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-21 · Howard Xiao, Brian Chao, Lior Yariv, Gordon Wetzstein

While Diffusion Language Models (DLMs) are theoretically well-suited for iterative refinement due to their non-causal structure, they often fail to reliably revise incorrect tokens in practice. The key challenge lies in the model's…

Machine Learning · Computer Science · 2026-01-30 · Shuibai Zhang, Fred Zhangzhi Peng, Yiheng Zhang, Jin Pan, Grigorios G. Chrysos

Decoding-based regression, which reformulates regression as a sequence generation task, has emerged as a promising paradigm of applying large language models for numerical prediction. However, its progress is hindered by the misalignment…

Machine Learning · Computer Science · 2025-12-09 · Ming Chen, Sheng Tang, Rong-Xi Tan, Ziniu Li, Jiacheng Chen, Ke Xue, Chao Qian

Diffusion models have achieved remarkable success in text-to-image generation. However, their practical applications are hindered by the misalignment between generated images and corresponding text prompts. To tackle this issue,…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-28 · Zijing Hu, Fengda Zhang, Long Chen, Kun Kuang, Jiahui Li, Kaifeng Gao, Jun Xiao, Xin Wang, Wenwu Zhu

We present a new paradigm for fine-tuning large-scale vision-language pre-trained models on downstream tasks, dubbed Prompt Regularization (ProReg). Different from traditional fine-tuning, which easily overfits to the downstream task data,…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-14 · Beier Zhu, Yulei Niu, Saeil Lee, Minhoe Hur, Hanwang Zhang

Discrete diffusion models have recently become competitive with autoregressive models for language modeling, even outperforming them on reasoning tasks requiring planning and global coherence, but they require more computation at inference…

Machine Learning · Computer Science · 2026-02-04 · Andre He, Sean Welleck, Daniel Fried

Discriminative pre-trained language models (PrLMs) can be generalized as denoising auto-encoders that work with two procedures, ennoising and denoising. First, an ennoising process corrupts texts with arbitrary noising functions to…

Computation and Language · Computer Science · 2022-10-12 · Zhuosheng Zhang, Hai Zhao, Ming Zhou

Diffusion models have emerged as a promising approach for text generation, with recent works falling into two main categories: discrete and continuous diffusion models. Discrete diffusion models apply token corruption independently using…

Computation and Language · Computer Science · 2025-05-29 · Bocheng Li, Zhujin Gao, Linli Xu

Diffusion language models offer a promising alternative to autoregressive models due to their global, non-causal generation process, but their continuous latent dynamics make discrete constraints -- e.g., the output should be a JSON file…

Machine Learning · Computer Science · 2026-05-28 · Jinwoo Kim, Taylor Berg-Kirkpatrick, Loris D'Antoni

Looped transformers scale computational depth without increasing parameter count by repeatedly applying a shared transformer block and can be used for iterative refinement, where each loop rewrites a full fixed-size prediction in parallel.…

Machine Learning · Computer Science · 2026-04-22 · Chris Cameron, Wangzheng Wang, Nikita Ivanov, Ashmita Bhattacharyya, Didier Chételat, Yingxue Zhang

Remote sensing semantic segmentation must address both what the ground objects are within an image and where they are located. Consequently, segmentation models must ensure not only the semantic correctness of large-scale patches…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-28 · Hao Wang, Keyan Hu, Xin Guo, Haifeng Li, Chao Tao

Reinforcement learning (RL) with unit test feedback has enhanced large language models' (LLMs) code generation, but relies on sparse rewards provided only after complete code evaluation, limiting learning efficiency and incremental…

Artificial Intelligence · Computer Science · 2025-02-05 · Ning Dai, Zheng Wu, Renjie Zheng, Ziyun Wei, Wenlei Shi, Xing Jin, Guanlin Liu, Chen Dun, Liang Huang, Lin Yan

To fully leverage the capabilities of diffusion models, we are often interested in optimizing downstream reward functions during inference. While numerous algorithms for reward-guided generation have been recently proposed due to their…

Machine Learning · Computer Science · 2025-04-18 · Masatoshi Uehara, Xingyu Su, Yulai Zhao, Xiner Li, Aviv Regev, Shuiwang Ji, Sergey Levine, Tommaso Biancalani

We present DiffusionBERT, a new generative masked language model based on discrete diffusion models. Diffusion models and many pre-trained language models have a shared training objective, i.e., denoising, making it possible to combine the…

Computation and Language · Computer Science · 2022-12-02 · Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu