Related papers: Aligning Diffusion Models by Optimizing Human Util…

Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization

Aligning large language models with human preferences has emerged as a critical focus in language modeling research. Yet, integrating preference learning into Text-to-Image (T2I) generative models is still relatively uncharted territory.…

Computer Vision and Pattern Recognition · Computer Science 2024-06-11 Yi Gu, Zhendong Wang, Yueqin Yin, Yujia Xie, Mingyuan Zhou

Diffusion Model Alignment Using Direct Preference Optimization

Large language models (LLMs) are fine-tuned using human comparison data with Reinforcement Learning from Human Feedback (RLHF) methods to make them better aligned with users' preferences. In contrast to LLMs, human preference learning has…

Computer Vision and Pattern Recognition · Computer Science 2023-11-23 Bram Wallace, Meihua Dang, Rafael Rafailov, Linqi Zhou, Aaron Lou, Senthil Purushwalkam, Stefano Ermon, Caiming Xiong, Shafiq Joty, Nikhil Naik

DyMO: Training-Free Diffusion Model Alignment with Dynamic Multi-Objective Scheduling

Text-to-image diffusion model alignment is critical for improving the alignment between the generated images and human preferences. While training-based methods are constrained by high computational costs and dataset requirements,…

Computer Vision and Pattern Recognition · Computer Science 2025-03-26 Xin Xie, Dong Gong

Divergence Minimization Preference Optimization for Diffusion Model Alignment

Diffusion models have achieved remarkable success in generating realistic and versatile images from text prompts. Inspired by the recent advancements of language models, there is an increasing interest in further improving the models by…

Computer Vision and Pattern Recognition · Computer Science 2025-10-07 Binxu Li, Minkai Xu, Jiaqi Han, Meihua Dang, Stefano Ermon

Personalized Image Editing in Text-to-Image Diffusion Models via Collaborative Direct Preference Optimization

Text-to-image (T2I) diffusion models have made remarkable strides in generating and editing high-fidelity images from text. Yet, these models remain fundamentally generic, failing to adapt to the nuanced aesthetic preferences of individual…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Connor Dunlop, Matthew Zheng, Kavana Venkatesh, Pinar Yanardag

Calibrated Multi-Preference Optimization for Aligning Diffusion Models

Aligning text-to-image (T2I) diffusion models with preference optimization is valuable for human-annotated datasets, but the heavy cost of manual data collection limits scalability. Using reward models offers an alternative, however,…

Computer Vision and Pattern Recognition · Computer Science 2025-09-29 Kyungmin Lee, Xiaohang Li, Qifei Wang, Junfeng He, Junjie Ke, Ming-Hsuan Yang, Irfan Essa, Jinwoo Shin, Feng Yang, Yinxiao Li

Rethinking Direct Preference Optimization in Diffusion Models

Aligning text-to-image (T2I) diffusion models with human preferences has emerged as a critical research challenge. While recent advances in this area have extended preference optimization techniques from large language models (LLMs) to the…

Computer Vision and Pattern Recognition · Computer Science 2025-12-25 Junyong Kang, Seohyun Lim, Kyungjune Baek, Hyunjung Shim

Towards Better Optimization For Listwise Preference in Diffusion Models

Reinforcement learning from human feedback (RLHF) has proven effective for aligning text-to-image (T2I) diffusion models with human preferences. Although Direct Preference Optimization (DPO) is widely adopted for its computational…

Computer Vision and Pattern Recognition · Computer Science 2025-10-03 Jiamu Bai, Xin Yu, Meilong Xu, Weitao Lu, Xin Pan, Kiwan Maeng, Daniel Kifer, Jian Wang, Yu Wang

Alignment and Safety of Diffusion Models via Reinforcement Learning and Reward Modeling: A Survey

Diffusion models have become a central paradigm for image and multimodal generation, yet their deployment raises persistent questions about alignment, safety, preference satisfaction, and robustness to misuse. This survey reviews recent…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Preeti Lamba, Kiran Ravish, Ankita Kushwaha, Pawan Kumar

Bridging the Gap: Aligning Text-to-Image Diffusion Models with Specific Feedback

Learning from feedback has been shown to enhance the alignment between text prompts and images in text-to-image diffusion models. However, due to the lack of focus in feedback content, especially regarding the object type and quantity,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-03 Xuexiang Niu, Jinping Tang, Lei Wang, Ge Zhu

Aligning Diffusion Language Models via Unpaired Preference Optimization

Diffusion language models (dLLMs) are an emerging alternative to autoregressive (AR) generators, but aligning them to human preferences is challenging because sequence log-likelihoods are intractable and pairwise preference data are costly…

Machine Learning · Computer Science 2025-11-13 Vaibhav Jindal, Hejian Sang, Chun-Mao Lai, Yanning Chen, Zhipeng Wang

Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences

Direct Preference Optimization (DPO) aligns text-to-image (T2I) generation models with human preferences using pairwise preference data. Although substantial resources are expended in collecting and labeling datasets, a critical aspect is…

Computer Vision and Pattern Recognition · Computer Science 2025-06-09 Yunhong Lu, Qichao Wang, Hengyuan Cao, Xiaoyin Xu, Min Zhang

Inference-Time Alignment of Diffusion Models with Direct Noise Optimization

In this work, we focus on the alignment problem of diffusion models with a continuous reward function, which represents specific objectives for downstream tasks, such as increasing darkness or improving the aesthetics of images. The central…

Machine Learning · Computer Science 2024-10-03 Zhiwei Tang, Jiangweizhi Peng, Jiasheng Tang, Mingyi Hong, Fan Wang, Tsung-Hui Chang

Alignment of Diffusion Model and Flow Matching for Text-to-Image Generation

Diffusion models and flow matching have demonstrated remarkable success in text-to-image generation. While many existing alignment methods primarily focus on fine-tuning pre-trained generative models to maximize a given reward function,…

Machine Learning · Statistics 2026-02-03 Yidong Ouyang, Liyan Xie, Hongyuan Zha, Guang Cheng

Aligning Diffusion Models with Noise-Conditioned Perception

Recent advancements in human preference optimization, initially developed for Language Models (LMs), have shown promise for text-to-image Diffusion Models, enhancing prompt alignment, visual appeal, and user preference. Unlike LMs,…

Computer Vision and Pattern Recognition · Computer Science 2025-12-03 Alexander Gambashidze, Anton Kulikov, Yuriy Sosnin, Ilya Makarov

Training Diffusion Models with Reinforcement Learning

Diffusion models are a class of flexible generative models trained with an approximation to the log-likelihood objective. However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives…

Machine Learning · Computer Science 2024-01-08 Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine

Free Lunch Alignment of Text-to-Image Diffusion Models without Preference Image Pairs

Recent advances in diffusion-based text-to-image (T2I) models have led to remarkable success in generating high-quality images from textual prompts. However, ensuring accurate alignment between the text and the generated image remains a…

Computer Vision and Pattern Recognition · Computer Science 2025-10-01 Jia Jun Cheng Xian, Muchen Li, Haotian Yang, Xin Tao, Pengfei Wan, Leonid Sigal, Renjie Liao

Diffusion Blend: Inference-Time Multi-Preference Alignment for Diffusion Models

Reinforcement learning (RL) algorithms have been used recently to align diffusion models with downstream objectives such as aesthetic quality and text-image consistency by fine-tuning them to maximize a single reward function under a fixed…

Artificial Intelligence · Computer Science 2026-03-13 Min Cheng, Fatemeh Doudi, Dileep Kalathil, Mohammad Ghavamzadeh, Panganamala R. Kumar

Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback

Direct preference optimization (DPO) methods have shown strong potential in aligning text-to-image diffusion models with human preferences by training on paired comparisons. These methods improve training stability by avoiding the REINFORCE…

Computer Vision and Pattern Recognition · Computer Science 2025-10-22 Yi-Lun Wu, Bo-Kai Ruan, Chiang Tseng, Hong-Han Shuai

InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment

Without using explicit reward, direct preference optimization (DPO) employs paired human preference data to fine-tune generative models, a method that has garnered considerable attention in large language models (LLMs). However, exploration…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Yunhong Lu, Qichao Wang, Hengyuan Cao, Xierui Wang, Xiaoyin Xu, Min Zhang