Related papers: Diffusion Alignment as Variational Expectation-Max…

Dynamic Search for Inference-Time Alignment in Diffusion Models

Diffusion models have shown promising generative capabilities across diverse domains, yet aligning their outputs with desired reward functions remains a challenge, particularly in cases where reward functions are non-differentiable. Some…

Machine Learning · Computer Science 2025-06-04 Xiner Li, Masatoshi Uehara, Xingyu Su, Gabriele Scalia, Tommaso Biancalani, Aviv Regev, Sergey Levine, Shuiwang Ji

HyperAlign: Hypernetwork for Efficient Test-Time Alignment of Diffusion Models

Diffusion model alignment aims to bridge the gap between generated outputs and human preferences by enhancing both semantic consistency with textual prompts and overall visual quality. Existing alignment methods face a challenging…

Computer Vision and Pattern Recognition · Computer Science 2026-04-01 Xin Xie, Jiaxian Guo, Dong Gong

Inference-Time Alignment of Diffusion Models with Direct Noise Optimization

In this work, we focus on the alignment problem of diffusion models with a continuous reward function, which represents specific objectives for downstream tasks, such as increasing darkness or improving the aesthetics of images. The central…

Machine Learning · Computer Science 2024-10-03 Zhiwei Tang, Jiangweizhi Peng, Jiasheng Tang, Mingyi Hong, Fan Wang, Tsung-Hui Chang

Alignment of Diffusion Model and Flow Matching for Text-to-Image Generation

Diffusion models and flow matching have demonstrated remarkable success in text-to-image generation. While many existing alignment methods primarily focus on fine-tuning pre-trained generative models to maximize a given reward function,…

Machine Learning · Statistics 2026-02-03 Yidong Ouyang, Liyan Xie, Hongyuan Zha, Guang Cheng

Maximize Your Diffusion: A Study into Reward Maximization and Alignment for Diffusion-based Control

Diffusion-based planning, learning, and control methods present a promising branch of powerful and expressive decision-making solutions. Given the growing interest, such methods have undergone numerous refinements over the past years.…

Machine Learning · Computer Science 2025-02-19 Dom Huh, Prasant Mohapatra

Non-differentiable Reward Optimization for Diffusion-based Autonomous Motion Planning

Safe and effective motion planning is crucial for autonomous robots. Diffusion models excel at capturing complex agent interactions, a fundamental aspect of decision-making in dynamic environments. Recent studies have successfully applied…

Robotics · Computer Science 2025-07-18 Giwon Lee, Daehee Park, Jaewoo Jeong, Kuk-Jin Yoon

Discrete Diffusion Trajectory Alignment via Stepwise Decomposition

Discrete diffusion models have demonstrated great promise in modeling various sequence data, ranging from human language to biological sequences. Inspired by the success of RL in language models, there is growing interest in further…

Machine Learning · Computer Science 2026-02-03 Jiaqi Han, Austin Wang, Minkai Xu, Wenda Chu, Meihua Dang, Haotian Ye, Huayu Chen, Yisong Yue, Stefano Ermon

Alignment and Safety of Diffusion Models via Reinforcement Learning and Reward Modeling: A Survey

Diffusion models have become a central paradigm for image and multimodal generation, yet their deployment raises persistent questions about alignment, safety, preference satisfaction, and robustness to misuse. This survey reviews recent…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Preeti Lamba, Kiran Ravish, Ankita Kushwaha, Pawan Kumar

Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference

Recent studies have demonstrated the effectiveness of directly aligning diffusion models with human preferences using differentiable reward. However, they exhibit two primary challenges: (1) they rely on multistep denoising with gradient…

Artificial Intelligence · Computer Science 2025-09-12 Xiangwei Shen, Zhimin Li, Zhantao Yang, Shiyi Zhang, Yingfang Zhang, Donghao Li, Chunyu Wang, Qinglin Lu, Yansong Tang

Inference-Time Alignment in Diffusion Models with Reward-Guided Generation: Tutorial and Review

This tutorial provides an in-depth guide on inference-time guidance and alignment methods for optimizing downstream reward functions in diffusion models. While diffusion models are renowned for their generative modeling capabilities,…

Artificial Intelligence · Computer Science 2025-01-22 Masatoshi Uehara, Yulai Zhao, Chenyu Wang, Xiner Li, Aviv Regev, Sergey Levine, Tommaso Biancalani

Test-time Alignment of Diffusion Models without Reward Over-optimization

Diffusion models excel in generative tasks, but aligning them with specific objectives while maintaining their versatility remains challenging. Existing fine-tuning methods often suffer from reward over-optimization, while approximate…

Machine Learning · Computer Science 2025-04-18 Sunwoo Kim, Minkyu Kim, Dongmin Park

Diffusion-DRF: Free, Rich, and Differentiable Reward for Video Diffusion Fine-Tuning

Video diffusion alignment has relied heavily on scalar rewards. These rewards are typically derived from learned reward models in human preference datasets, requiring additional training and extensive collection. Moreover, scalar…

Computer Vision and Pattern Recognition · Computer Science 2026-03-18 Yifan Wang, Yanyu Li, Gordon Guocheng Qian, Sergey Tulyakov, Yun Fu, Anil Kag

Direct Distributional Optimization for Provable Alignment of Diffusion Models

We introduce a novel alignment method for diffusion models from distribution optimization perspectives while providing rigorous convergence guarantees. We first formulate the problem as a generic regularized loss minimization over…

Machine Learning · Computer Science 2025-03-07 Ryotaro Kawata, Kazusato Oko, Atsushi Nitanda, Taiji Suzuki

ReAlign: Text-to-Motion Generation via Step-Aware Reward-Guided Alignment

Text-to-motion generation, which synthesizes 3D human motions from text inputs, holds immense potential for applications in gaming, film, and robotics. Recently, diffusion-based methods have been shown to generate more diversity and…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Wanjiang Weng, Xiaofeng Tan, Junbo Wang, Guo-Sen Xie, Pan Zhou, Hongsong Wang

Diffusion Alignment Beyond KL: Variance Minimisation as Effective Policy Optimiser

Diffusion alignment adapts pretrained diffusion models to sample from reward-tilted distributions along the denoising trajectory. This process naturally admits a Sequential Monte Carlo (SMC) interpretation, where the denoising model acts as…

Machine Learning · Computer Science 2026-02-13 Zijing Ou, Jacob Si, Junyi Zhu, Ondrej Bohdal, Mete Ozay, Taha Ceritli, Yingzhen Li

Diffusion-based Learning Framework for Constrained Nonconvex Optimization with Weighted Bootstrapped Refinement

Recent advances in diffusion models show promising potential to accelerate nonconvex problem solving by leveraging their multimodality. However, most existing diffusion-based optimization approaches rely on supervised learning and lack a…

Machine Learning · Computer Science 2026-05-29 Shutong Ding, Yimiao Zhou, Ke Hu, Xi Yao, Junchi Yan, Xiaoying Tang, Ye Shi

Training Diffusion Models with Reinforcement Learning

Diffusion models are a class of flexible generative models trained with an approximation to the log-likelihood objective. However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives…

Machine Learning · Computer Science 2024-01-08 Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine

DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning

Reinforcement Learning has become a standard paradigm for aligning Large Language Models with human intent and task requirements. While Group Relative Policy Optimization offers an efficient, value-model-free alternative to Proximal Policy…

Computation and Language · Computer Science 2026-05-26 Guochao Jiang, Jingyi Song, Guofeng Quan, Chuzhan Hao, Guohua Liu, Yuewei Zhang

Training-free Diffusion Model Alignment with Sampling Demons

Aligning diffusion models with user preferences has been a key challenge. Existing methods for aligning diffusion models either require retraining or are limited to differentiable reward functions. To address these limitations, we propose a…

Computer Vision and Pattern Recognition · Computer Science 2025-02-28 Po-Hung Yeh, Kuang-Huei Lee, Jun-Cheng Chen

VARD: Efficient and Dense Fine-Tuning for Diffusion Models with Value-based RL

Diffusion models have emerged as powerful generative tools across various domains, yet tailoring pre-trained models to exhibit specific desirable properties remains challenging. While reinforcement learning (RL) offers a promising…

Computer Vision and Pattern Recognition · Computer Science 2025-06-03 Fengyuan Dai, Zifeng Zhuang, Yufei Huang, Siteng Huang, Bangyan Liao, Donglin Wang, Fajie Yuan