Related papers: Optimizing Decoding Paths in Masked Diffusion Mode…

Masked Diffusion Models are Secretly Learned-Order Autoregressive Models

Masked Diffusion Models (MDMs) have emerged as one of the most promising paradigms for generative modeling over discrete domains. It is known that MDMs effectively train to decode tokens in a random order, and that this ordering has…

Machine Learning · Computer Science 2025-11-25 Prateek Garg, Bhavya Kohli, Sunita Sarawagi

Entropy-driven Sampling and Training Scheme for Conditional Diffusion Generation

Denoising Diffusion Probabilistic Model (DDPM) is able to make flexible conditional image generation from prior noise to real data, by introducing an independent noise-aware classifier to provide conditional gradient guidance at each time…

Computer Vision and Pattern Recognition · Computer Science 2022-09-21 Shengming Li, Guangcong Zheng, Hui Wang, Taiping Yao, Yang Chen, Shoudong Ding, Xi Li

Timestep-Aware Block Masking for Efficient Diffusion Model Inference

Diffusion Probabilistic Models (DPMs) have achieved great success in image generation but suffer from high inference latency due to their iterative denoising nature. Motivated by the evolving feature dynamics across the denoising…

Computer Vision and Pattern Recognition · Computer Science 2026-03-23 Haodong He, Yuan Gao, Weizhong Zhang, Gui-Song Xia

Generation Order and Parallel Decoding in Masked Diffusion Models: An Information-Theoretic Perspective

Masked Diffusion Models (MDMs) significantly accelerate inference by trading off sequential determinism. However, the theoretical mechanisms governing generation order and the risks inherent in parallelization remain under-explored. In this…

Machine Learning · Computer Science 2026-02-03 Shaorong Zhang, Longxuan Yu, Rob Brekelmans, Luhan Tang, Salman Asif, Greg Ver Steeg

Deep Out-of-Distribution Uncertainty Quantification via Weight Entropy Maximization

This paper deals with uncertainty quantification and out-of-distribution detection in deep learning using Bayesian and ensemble methods. It proposes a practical solution to the lack of prediction diversity observed recently for standard…

Machine Learning · Computer Science 2025-02-03 Antoine de Mathelin, François Deheeger, Mathilde Mougeot, Nicolas Vayatis

EVODiff: Entropy-aware Variance Optimized Diffusion Inference

Diffusion models (DMs) excel in image generation but suffer from slow inference and training-inference discrepancies. Although gradient-based solvers for DMs accelerate denoising inference, they often lack theoretical foundations in…

Computer Vision and Pattern Recognition · Computer Science 2026-02-04 Shigui Li, Wei Chen, Delu Zeng

Improving Discrete Diffusion Unmasking Policies Beyond Explicit Reference Policies

Masked diffusion models (MDMs) have recently emerged as a novel framework for language modeling. MDMs generate sentences by iteratively denoising masked sequences, filling in [MASK] tokens step by step. Although MDMs support any-order…

Machine Learning · Computer Science 2026-02-27 Chunsan Hong, Seonho An, Min-Soo Kim, Jong Chul Ye

Think Twice Before You Write -- an Entropy-based Decoding Strategy to Enhance LLM Reasoning

Decoding strategies play a central role in shaping the reasoning ability of large language models (LLMs). Traditional methods such as greedy decoding and beam search often suffer from error propagation, while sampling-based approaches…

Computation and Language · Computer Science 2026-04-02 Jiashu He, Meizhu Liu, Olaitan P Olaleye, Amit Agarwal, M. Avendi, Yassi Abbasi, Matthew Rowe, Hitesh Laxmichand Patel, Paul Li, Tao Sheng, Sujith Ravi, Dan Roth

Path Planning for Masked Diffusion Model Sampling

Any order generation of discrete data using masked diffusion models (MDMs) offers a compelling alternative to traditional autoregressive models, especially in domains that lack a natural causal ordering of data. However, current popular…

Machine Learning · Computer Science 2026-03-06 Fred Zhangzhi Peng, Zachary Bezemek, Sawan Patel, Jarrid Rector-Brooks, Sherwood Yao, Avishek Joey Bose, Alexander Tong, Pranam Chatterjee

Co-GRPO: Co-Optimized Group Relative Policy Optimization for Masked Diffusion Model

Recently, Masked Diffusion Models (MDMs) have shown promising potential across vision, language, and cross-modal generation. However, a notable discrepancy exists between their training and inference procedures. In particular, MDM inference…

Machine Learning · Computer Science 2025-12-30 Renping Zhou, Zanlin Ni, Tianyi Chen, Zeyu Liu, Yang Yue, Yulin Wang, Yuxuan Wang, Jingshu Liu, Gao Huang

Denoising diffusion models for out-of-distribution detection

Out-of-distribution detection is crucial to the safe deployment of machine learning systems. Currently, unsupervised out-of-distribution detection is dominated by generative-based approaches that make use of estimates of the likelihood or…

Machine Learning · Computer Science 2023-04-24 Mark S. Graham, Walter H. L. Pinaya, Petru-Daniel Tudosiu, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

Learn from Your Mistakes: Self-Correcting Masked Diffusion Models

Masked diffusion models (MDMs) have emerged as a promising alternative to autoregressive models, enabling parallel token generation while achieving competitive performance. Despite these advantages, MDMs face a fundamental limitation: once…

Machine Learning · Computer Science 2026-03-06 Yair Schiff, Omer Belhasin, Roy Uziel, Guanghan Wang, Marianne Arriola, Gilad Turok, Michael Elad, Volodymyr Kuleshov

Error Bounds and Optimal Schedules for Masked Diffusions with Factorized Approximations

Recently proposed generative models for discrete data, such as Masked Diffusion Models (MDMs), exploit conditional independence approximations to reduce the computational cost of popular Auto-Regressive Models (ARMs), at the price of some…

Machine Learning · Statistics 2025-12-18 Hugo Lavenant, Giacomo Zanella

Lookahead Unmasking Elicits Accurate Decoding in Diffusion Language Models

Masked Diffusion Models (MDMs) as language models generate by iteratively unmasking tokens, yet their performance crucially depends on the inference time order of unmasking. Prevailing heuristics, such as confidence based sampling, are…

Machine Learning · Computer Science 2025-11-11 Sanghyun Lee, Seungryong Kim, Jongho Park, Dongmin Park

The Missing U for Efficient Diffusion Models

Diffusion Probabilistic Models stand as a critical tool in generative modelling, enabling the generation of complex data distributions. This family of generative models yields record-breaking performance in tasks such as image synthesis,…

Machine Learning · Computer Science 2024-04-08 Sergio Calvo-Ordonez, Chun-Wun Cheng, Jiahao Huang, Lipei Zhang, Guang Yang, Carola-Bibiane Schonlieb, Angelica I Aviles-Rivero

Entropy-informed Decoding: Adaptive Information-Driven Branching

Large language models (LLMs) achieve remarkable generative performance, yet their output quality is dependent on the decoding strategy. While sampling-based methods (e.g., top-k, nucleus) and search-and-select based methods (e.g., beam…

Machine Learning · Computer Science 2026-05-12 Benjamin Patrick Evans, Sumitra Ganesh, Leo Ardon

Improving Text Style Transfer using Masked Diffusion Language Models with Inference-time Scaling

Masked diffusion language models (MDMs) have recently gained traction as a viable generative framework for natural language. This can be attributed to its scalability and ease of training compared to other diffusion model paradigms for…

Computation and Language · Computer Science 2025-08-19 Tejomay Kishor Padole, Suyash P Awate, Pushpak Bhattacharyya

Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction

Generative modeling of discrete data underlies important applications spanning text-based agents like ChatGPT to the design of the very building blocks of life in protein sequences. However, application domains need to exert control over…

Machine Learning · Computer Science 2024-10-11 Jarrid Rector-Brooks, Mohsin Hasan, Zhangzhi Peng, Zachary Quinn, Chenghao Liu, Sarthak Mittal, Nouha Dziri, Michael Bronstein, Yoshua Bengio, Pranam Chatterjee, Alexander Tong, Avishek Joey Bose

On Inference Stability for Diffusion Models

Denoising Probabilistic Models (DPMs) represent an emerging domain of generative models that excel in generating diverse and high-quality images. However, most current training methods for DPMs often neglect the correlation between…

Computer Vision and Pattern Recognition · Computer Science 2024-02-01 Viet Nguyen, Giang Vu, Tung Nguyen Thanh, Khoat Than, Toan Tran

Test-Time Scaling of Diffusion Models via Noise Trajectory Search

The iterative and stochastic nature of diffusion models enables test-time scaling, whereby spending additional compute during denoising generates higher-fidelity samples. Increasing the number of denoising steps is the primary scaling axis,…

Machine Learning · Computer Science 2025-09-09 Vignav Ramesh, Morteza Mardani