Related papers: HyperAlign: Hypernetwork for Efficient Test-Time A…

Alignment of Diffusion Model and Flow Matching for Text-to-Image Generation

Diffusion models and flow matching have demonstrated remarkable success in text-to-image generation. While many existing alignment methods primarily focus on fine-tuning pre-trained generative models to maximize a given reward function,…

Machine Learning · Statistics 2026-02-03 Yidong Ouyang, Liyan Xie, Hongyuan Zha, Guang Cheng

ReAlign: Text-to-Motion Generation via Step-Aware Reward-Guided Alignment

Text-to-motion generation, which synthesizes 3D human motions from text inputs, holds immense potential for applications in gaming, film, and robotics. Recently, diffusion-based methods have been shown to generate more diversity and…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Wanjiang Weng, Xiaofeng Tan, Junbo Wang, Guo-Sen Xie, Pan Zhou, Hongsong Wang

Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models

The new paradigm of test-time scaling has yielded remarkable breakthroughs in Large Language Models (LLMs) (e.g. reasoning models) and in generative vision models, allowing models to allocate additional computation during inference to…

Machine Learning · Computer Science 2025-08-14 Luca Eyring, Shyamgopal Karthik, Alexey Dosovitskiy, Nataniel Ruiz, Zeynep Akata

Alignment and Safety of Diffusion Models via Reinforcement Learning and Reward Modeling: A Survey

Diffusion models have become a central paradigm for image and multimodal generation, yet their deployment raises persistent questions about alignment, safety, preference satisfaction, and robustness to misuse. This survey reviews recent…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Preeti Lamba, Kiran Ravish, Ankita Kushwaha, Pawan Kumar

Dynamic Search for Inference-Time Alignment in Diffusion Models

Diffusion models have shown promising generative capabilities across diverse domains, yet aligning their outputs with desired reward functions remains a challenge, particularly in cases where reward functions are non-differentiable. Some…

Machine Learning · Computer Science 2025-06-04 Xiner Li, Masatoshi Uehara, Xingyu Su, Gabriele Scalia, Tommaso Biancalani, Aviv Regev, Sergey Levine, Shuiwang Ji

Diffusion Alignment as Variational Expectation-Maximization

Diffusion alignment aims to optimize diffusion models for the downstream objective. While existing methods based on reinforcement learning or direct backpropagation achieve considerable success in maximizing rewards, they often suffer from…

Machine Learning · Computer Science 2026-03-09 Jaewoo Lee, Minsu Kim, Sanghyeok Choi, Inhyuck Song, Sujin Yun, Hyeongyu Kang, Woocheol Shin, Taeyoung Yun, Kiyoung Om, Jinkyoo Park

ImageReFL: Balancing Quality and Diversity in Human-Aligned Diffusion Models

Recent advances in diffusion models have led to impressive image generation capabilities, but aligning these models with human preferences remains challenging. Reward-based fine-tuning using models trained on human feedback improves…

Computer Vision and Pattern Recognition · Computer Science 2025-05-29 Dmitrii Sorokin, Maksim Nakhodnov, Andrey Kuznetsov, Aibek Alanov

Inference-Time Alignment in Diffusion Models with Reward-Guided Generation: Tutorial and Review

This tutorial provides an in-depth guide on inference-time guidance and alignment methods for optimizing downstream reward functions in diffusion models. While diffusion models are renowned for their generative modeling capabilities,…

Artificial Intelligence · Computer Science 2025-01-22 Masatoshi Uehara, Yulai Zhao, Chenyu Wang, Xiner Li, Aviv Regev, Sergey Levine, Tommaso Biancalani

Diffusion Blend: Inference-Time Multi-Preference Alignment for Diffusion Models

Reinforcement learning (RL) algorithms have been used recently to align diffusion models with downstream objectives such as aesthetic quality and text-image consistency by fine-tuning them to maximize a single reward function under a fixed…

Artificial Intelligence · Computer Science 2026-03-13 Min Cheng, Fatemeh Doudi, Dileep Kalathil, Mohammad Ghavamzadeh, Panganamala R. Kumar

InfAlign: Inference-aware language model alignment

Language model alignment is a critical step in training modern generative language models. Alignment aims to improve the win rate of a sample from the aligned model against the base model. Today, we are increasingly using inference-time…

Machine Learning · Computer Science 2025-08-22 Ananth Balashankar, Ziteng Sun, Jonathan Berant, Jacob Eisenstein, Michael Collins, Adrian Hutter, Jong Lee, Chirag Nagpal, Flavien Prost, Aradhana Sinha, Ananda Theertha Suresh, Ahmad Beirami

Improving GFlowNets for Text-to-Image Diffusion Alignment

Diffusion models have become the de-facto approach for generating visual data, which are trained to match the distribution of the training dataset. In addition, we also want to control generation to fulfill desired properties such as…

Machine Learning · Computer Science 2024-12-30 Dinghuai Zhang, Yizhe Zhang, Jiatao Gu, Ruixiang Zhang, Josh Susskind, Navdeep Jaitly, Shuangfei Zhai

TextAlign: Preference Alignment for Text Rendering with Hierarchical Rewards

Faithful text rendering remains a persistent weakness of large text-to-image generative models, as it requires both semantic instruction following and fine-grained glyph-level structure. Prior methods often improve this ability through…

Computer Vision and Pattern Recognition · Computer Science 2026-05-20 Mingxuan Cui, Jingpu Yang, Fengxian Ji, Qian Jiang, Zhecheng Shi, Jiaming Wang, Zirui Song, Fajri Koto, Xiuying Chen

Composition and Alignment of Diffusion Models using Constrained Learning

Diffusion models have become prevalent in generative modeling due to their ability to sample from complex distributions. To improve the quality of generated samples and their compliance with user requirements, two commonly used methods are:…

Machine Learning · Computer Science 2025-12-01 Shervin Khalafi, Ignacio Hounie, Dongsheng Ding, Alejandro Ribeiro

HyperAlign: Hyperbolic Entailment Cones for Adaptive Text-to-Image Alignment Assessment

With the rapid development of text-to-image generation technology, accurately assessing the alignment between generated images and text prompts has become a critical challenge. Existing methods rely on Euclidean space metrics, neglecting…

Computer Vision and Pattern Recognition · Computer Science 2026-03-20 Wenzhi Chen, Bo Hu, Leida Li, Lihuo He, Wen Lu, Xinbo Gao

LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories

This paper focuses on the alignment of flow matching models with human preferences. A promising way is fine-tuning by directly backpropagating reward gradients through the differentiable generation process of flow matching. However,…

Computer Vision and Pattern Recognition · Computer Science 2026-05-05 Zhanhao Liang, Tao Yang, Jie Wu, Chengjian Feng, Liang Zheng

Inference-Time Alignment of Diffusion Models via Evolutionary Algorithms

Diffusion models are state-of-the-art generative models, yet their samples often fail to satisfy application objectives such as safety constraints or domain-specific validity. Existing techniques for alignment require gradients, internal…

Machine Learning · Computer Science 2025-11-27 Purvish Jajal, Nick John Eliopoulos, Benjamin Shiue-Hal Chou, George K. Thiruvathukal, James C. Davis, Yung-Hsiang Lu

Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference

Recent studies have demonstrated the effectiveness of directly aligning diffusion models with human preferences using differentiable reward. However, they exhibit two primary challenges: (1) they rely on multistep denoising with gradient…

Artificial Intelligence · Computer Science 2025-09-12 Xiangwei Shen, Zhimin Li, Zhantao Yang, Shiyi Zhang, Yingfang Zhang, Donghao Li, Chunyu Wang, Qinglin Lu, Yansong Tang

Inference-Time Alignment Control for Diffusion Models with Reinforcement Learning Guidance

Denoising-based generative models, particularly diffusion and flow matching algorithms, have achieved remarkable success. However, aligning their output distributions with complex downstream objectives, such as human preferences,…

Machine Learning · Computer Science 2025-08-29 Luozhijie Jin, Zijie Qiu, Jie Liu, Zijie Diao, Lifeng Qiao, Ning Ding, Alex Lamb, Xipeng Qiu

Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search

The remarkable progress in text-to-video diffusion models enables the generation of photorealistic videos, although the content of these generated videos often includes unnatural movement or deformation, reverse playback, and motionless…

Computer Vision and Pattern Recognition · Computer Science 2025-10-08 Yuta Oshima, Masahiro Suzuki, Yutaka Matsuo, Hiroki Furuta

UniAlignment: Semantic Alignment for Unified Image Generation, Understanding, Manipulation and Perception

The remarkable success of diffusion models in text-to-image generation has sparked growing interest in expanding their capabilities to a variety of multi-modal tasks, including image understanding, manipulation, and perception. These tasks…

Computer Vision and Pattern Recognition · Computer Science 2025-09-30 Xinyang Song, Libin Wang, Weining Wang, Shaozhen Liu, Dandan Zheng, Jingdong Chen, Qi Li, Zhenan Sun