Related papers: Large-scale Reinforcement Learning for Diffusion M…

Feedback Efficient Online Fine-Tuning of Diffusion Models

Diffusion models excel at modeling complex data distributions, including those of images, proteins, and small molecules. However, in many cases, our goal is to model parts of the distribution that maximize certain properties: for example,…

Machine Learning · Computer Science 2024-07-19 Masatoshi Uehara, Yulai Zhao, Kevin Black, Ehsan Hajiramezanali, Gabriele Scalia, Nathaniel Lee Diamant, Alex M Tseng, Sergey Levine, Tommaso Biancalani

Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review

This tutorial provides a comprehensive survey of methods for fine-tuning diffusion models to optimize downstream reward functions. While diffusion models are widely known to provide excellent generative modeling capability, practical…

Machine Learning · Computer Science 2024-07-19 Masatoshi Uehara, Yulai Zhao, Tommaso Biancalani, Sergey Levine

DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models

Learning from human feedback has been shown to improve text-to-image models. These techniques first learn a reward function that captures what humans care about in the task and then improve the models based on the learned reward function.…

Machine Learning · Computer Science 2023-11-02 Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, Moonkyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, Kimin Lee

Data-regularized Reinforcement Learning for Diffusion Models at Scale

Aligning generative diffusion models with human preferences via reinforcement learning (RL) is critical yet challenging. Most existing algorithms are often vulnerable to reward hacking, such as quality degradation, over-stylization, or…

Machine Learning · Computer Science 2025-12-25 Haotian Ye, Kaiwen Zheng, Jiashu Xu, Puheng Li, Huayu Chen, Jiaqi Han, Sheng Liu, Qinsheng Zhang, Hanzi Mao, Zekun Hao, Prithvijit Chattopadhyay, Dinghao Yang, Liang Feng, Maosheng Liao, Junjie Bai, Ming-Yu Liu, James Zou, Stefano Ermon

RL for Consistency Models: Faster Reward Guided Text-to-Image Generation

Reinforcement learning (RL) has improved guided image generation with diffusion models by directly optimizing rewards that capture image quality, aesthetics, and instruction following capabilities. However, the resulting generative policies…

Computer Vision and Pattern Recognition · Computer Science 2024-06-25 Owen Oertell, Jonathan D. Chang, Yiyi Zhang, Kianté Brantley, Wen Sun

Optimizing 3D Diffusion Models for Medical Imaging via Multi-Scale Reward Learning

Diffusion models have emerged as powerful tools for 3D medical image generation, yet bridging the gap between standard training objectives and clinical relevance remains a challenge. This paper presents a method to enhance 3D diffusion…

Computer Vision and Pattern Recognition · Computer Science 2026-03-09 Yueying Tian, Xudong Han, Meng Zhou, Rodrigo Aviles-Espinosa, Rupert Young, Philip Birch

Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning

Personalized text-to-image models allow users to generate varied styles of images (specified with a sentence) for an object (specified with a set of reference images). While remarkable results have been achieved using diffusion-based…

Computer Vision and Pattern Recognition · Computer Science 2024-07-19 Fanyue Wei, Wei Zeng, Zhenyang Li, Dawei Yin, Lixin Duan, Wen Li

Training Diffusion Models with Reinforcement Learning

Diffusion models are a class of flexible generative models trained with an approximation to the log-likelihood objective. However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives…

Machine Learning · Computer Science 2024-01-08 Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine

Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation

Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI), especially when compared with the remarkable progress made in fine-tuning Large Language Models (LLMs). While cutting-edge…

Machine Learning · Computer Science 2024-02-16 Huizhuo Yuan, Zixiang Chen, Kaixuan Ji, Quanquan Gu

Diffusion Models for Reinforcement Learning: A Survey

Diffusion models surpass previous generative models in sample quality and training stability. Recent works have shown the advantages of diffusion models in improving reinforcement learning (RL) solutions. This survey aims to provide an…

Machine Learning · Computer Science 2024-02-26 Zhengbang Zhu, Hanye Zhao, Haoran He, Yichao Zhong, Shenyu Zhang, Haoquan Guo, Tingting Chen, Weinan Zhang

Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards

Diffusion models have achieved remarkable success in text-to-image generation. However, their practical applications are hindered by the misalignment between generated images and corresponding text prompts. To tackle this issue,…

Computer Vision and Pattern Recognition · Computer Science 2025-03-28 Zijing Hu, Fengda Zhang, Long Chen, Kun Kuang, Jiahui Li, Kaifeng Gao, Jun Xiao, Xin Wang, Wenwu Zhu

Enhancing Diffusion-based Restoration Models via Difficulty-Adaptive Reinforcement Learning with IQA Reward

Reinforcement Learning (RL) has recently been incorporated into diffusion models, e.g., in tasks such as text-to-image generation. However, directly applying existing RL methods to diffusion-based image restoration models is suboptimal, as the objective…

Computer Vision and Pattern Recognition · Computer Science 2025-11-04 Xiaogang Xu, Ruihang Chu, Jian Wang, Kun Zhou, Wenjie Shu, Harry Yang, Ser-Nam Lim, Hao Chen, Liang Lin

Finite Difference Flow Optimization for RL Post-Training of Text-to-Image Models

Reinforcement learning (RL) has become a standard technique for post-training diffusion-based image synthesis models, as it enables learning from reward signals to explicitly improve desirable aspects such as image quality and prompt…

Computer Vision and Pattern Recognition · Computer Science 2026-03-16 David McAllister, Miika Aittala, Tero Karras, Janne Hellsten, Angjoo Kanazawa, Timo Aila, Samuli Laine

InstructRL4Pix: Training Diffusion for Image Editing by Reinforcement Learning

Instruction-based image editing has made great progress in using natural human language to manipulate the visual content of images. However, existing models are limited by the quality of the dataset and cannot accurately localize editing…

Computer Vision and Pattern Recognition · Computer Science 2024-06-17 Tiancheng Li, Jinxiu Liu, Huajun Chen, Qi Liu

Reinforcement Learning with Discrete Diffusion Policies for Combinatorial Action Spaces

Reinforcement learning (RL) struggles to scale to large, combinatorial action spaces common in many real-world problems. This paper introduces a novel framework for training discrete diffusion models as highly effective policies in these…

Machine Learning · Computer Science 2026-05-21 Haitong Ma, Ofir Nabati, Aviv Rosenberg, Bo Dai, Oran Lang, Craig Boutilier, Na Li, Shie Mannor, Lior Shani, Guy Tenneholtz

Critic-Guided Reinforcement Unlearning in Text-to-Image Diffusion

Machine unlearning in text-to-image diffusion models aims to remove targeted concepts while preserving overall utility. Prior diffusion unlearning methods typically rely on supervised weight edits or global penalties; reinforcement-learning…

Machine Learning · Computer Science 2026-02-17 Mykola Vysotskyi, Zahar Kohut, Mariia Shpir, Taras Rumezhak, Volodymyr Karpiv

Alignment and Safety of Diffusion Models via Reinforcement Learning and Reward Modeling: A Survey

Diffusion models have become a central paradigm for image and multimodal generation, yet their deployment raises persistent questions about alignment, safety, preference satisfaction, and robustness to misuse. This survey reviews recent…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Preeti Lamba, Kiran Ravish, Ankita Kushwaha, Pawan Kumar

Adding Conditional Control to Diffusion Models with Reinforcement Learning

Diffusion models are powerful generative models that allow for precise control over the characteristics of the generated samples. While these diffusion models trained on large datasets have achieved success, there is often a need to…

Machine Learning · Computer Science 2025-02-25 Yulai Zhao, Masatoshi Uehara, Gabriele Scalia, Sunyuan Kung, Tommaso Biancalani, Sergey Levine, Ehsan Hajiramezanali

Boosting Diffusion-Based Text Image Super-Resolution Model Towards Generalized Real-World Scenarios

Restoring low-resolution text images presents a significant challenge, as it requires maintaining both the fidelity and stylistic realism of the text in restored images. Existing text image restoration methods often fall short in hard…

Computer Vision and Pattern Recognition · Computer Science 2025-06-17 Chenglu Pan, Xiaogang Xu, Ganggui Ding, Yunke Zhang, Wenbo Li, Jiarong Xu, Qingbiao Wu

Diffusion Blend: Inference-Time Multi-Preference Alignment for Diffusion Models

Reinforcement learning (RL) algorithms have been used recently to align diffusion models with downstream objectives such as aesthetic quality and text-image consistency by fine-tuning them to maximize a single reward function under a fixed…

Artificial Intelligence · Computer Science 2026-03-13 Min Cheng, Fatemeh Doudi, Dileep Kalathil, Mohammad Ghavamzadeh, Panganamala R. Kumar