Related papers: Data-regularized Reinforcement Learning for Diffus…

200 papers

Distribution Matching Distillation (DMD) facilitates efficient inference by distilling multi-step diffusion models into few-step variants. Concurrently, Reinforcement Learning (RL) has emerged as a vital tool for aligning generative models…

Computer Vision and Pattern Recognition · Computer Science 2026-03-26 Dengyang Jiang, Dongyang Liu, Zanyi Wang, Qilong Wu, Liuzhuozheng Li, Hengzhuang Li, Xin Jin, David Liu, Changsheng Lu, Zhen Li, Bo Zhang, Mengmeng Wang, Steven Hoi, Peng Gao, Harry Yang

Reinforcement learning from human feedback (RLHF) has proven effective in aligning large language models with human preferences, inspiring the development of reward-centric diffusion reinforcement learning (RDRL) to achieve similar…

Machine Learning · Computer Science 2026-03-24 Kwanyoung Kim, Byeongsu Sim

Text-to-image diffusion models are a class of deep generative models that have demonstrated an impressive capacity for high-quality image generation. However, these models are susceptible to implicit biases that arise from web-scale…

Computer Vision and Pattern Recognition · Computer Science 2024-01-24 Yinan Zhang, Eric Tzeng, Yilun Du, Dmitry Kislyuk

Reinforcement learning (RL) struggles to scale to large, combinatorial action spaces common in many real-world problems. This paper introduces a novel framework for training discrete diffusion models as highly effective policies in these…

Machine Learning · Computer Science 2026-05-21 Haitong Ma, Ofir Nabati, Aviv Rosenberg, Bo Dai, Oran Lang, Craig Boutilier, Na Li, Shie Mannor, Lior Shani, Guy Tenneholtz

Diffusion and flow models achieve state-of-the-art (SOTA) generative performance, yet many practically important behaviors such as fine-grained prompt fidelity, compositional correctness, and text rendering are weakly specified by score or…

Computer Vision and Pattern Recognition · Computer Science 2026-03-17 Yuanzhi Zhu, Xi Wang, Stéphane Lathuilière, Vicky Kalogeiton

Video diffusion alignment has relied heavily on scalar rewards. These rewards are typically derived from reward models learned on human preference datasets, requiring additional training and extensive collection. Moreover, scalar…

Computer Vision and Pattern Recognition · Computer Science 2026-03-18 Yifan Wang, Yanyu Li, Gordon Guocheng Qian, Sergey Tulyakov, Yun Fu, Anil Kag

In the ever-changing and intricate landscape of financial markets, portfolio optimisation remains a formidable challenge for investors and asset managers. Conventional methods often struggle to capture the complex dynamics of market…

Machine Learning · Statistics 2025-10-09 Himanshu Choudhary, Arishi Orra, Manoj Thakur

To date, distributional reinforcement learning (distributional RL) methods have exclusively focused on the discounted setting, where an agent aims to optimize a discounted sum of rewards over time. In this work, we extend distributional RL…

Machine Learning · Computer Science 2026-01-14 Juan Sebastian Rojas, Chi-Guhn Lee

Constrained reinforcement learning (RL) seeks high-performance policies under safety constraints. We focus on an offline setting where the agent has only a fixed dataset -- common in realistic tasks to prevent unsafe exploration. To address…

Machine Learning · Computer Science 2025-09-08 Junyu Guo, Zhi Zheng, Donghao Ying, Ming Jin, Shangding Gu, Costas Spanos, Javad Lavaei

Deep Reinforcement Learning (DRL) has achieved great success in solving complicated decision-making problems. Despite these successes, DRL is frequently criticized for many reasons, e.g., data inefficiency, inflexible and intractable reward…

Machine Learning · Computer Science 2023-02-07 Weiqin Chen

Diffusion models have achieved remarkable success in text-to-image generation. However, their practical applications are hindered by the misalignment between generated images and corresponding text prompts. To tackle this issue,…

Computer Vision and Pattern Recognition · Computer Science 2025-03-28 Zijing Hu, Fengda Zhang, Long Chen, Kun Kuang, Jiahui Li, Kaifeng Gao, Jun Xiao, Xin Wang, Wenwu Zhu

The framework of deep reinforcement learning (DRL) provides a powerful and widely applicable mathematical formalization for sequential decision-making. This paper presents a novel DRL framework, termed $f$-Divergence Reinforcement…

Machine Learning · Computer Science 2021-12-15 Chen Gong, Qiang He, Yunpeng Bai, Zhou Yang, Xiaoyu Chen, Xinwen Hou, Xianjie Zhang, Yu Liu, Guoliang Fan

This tutorial provides a comprehensive survey of methods for fine-tuning diffusion models to optimize downstream reward functions. While diffusion models are widely known to provide excellent generative modeling capability, practical…

Machine Learning · Computer Science 2024-07-19 Masatoshi Uehara, Yulai Zhao, Tommaso Biancalani, Sergey Levine

Many challenging real-world problems require the deployment of ensembles of multiple complementary learning models to reach acceptable performance levels. While effective, applying the entire ensemble to every sample is costly and often…

Cryptography and Security · Computer Science 2022-09-20 Orel Lavie, Asaf Shabtai, Gilad Katz

Distributional reinforcement learning (DRL) is a recent reinforcement learning framework whose success has been supported by various empirical studies. It relies on the key idea of replacing the expected return with the return distribution,…

Machine Learning · Computer Science 2020-01-09 Rahul Singh, Keuntaek Lee, Yongxin Chen

Diffusion models are powerful generative models that allow for precise control over the characteristics of the generated samples. While these diffusion models trained on large datasets have achieved success, there is often a need to…

Dynamic resource allocation in mobile wireless networks involves complex, time-varying optimization problems, motivating the adoption of deep reinforcement learning (DRL). However, most existing works rely on pre-trained policies,…

Machine Learning · Computer Science 2025-02-12 Xinren Zhang, Jiadong Yu

Reinforcement learning (RL) algorithms have been used recently to align diffusion models with downstream objectives such as aesthetic quality and text-image consistency by fine-tuning them to maximize a single reward function under a fixed…

Artificial Intelligence · Computer Science 2026-03-13 Min Cheng, Fatemeh Doudi, Dileep Kalathil, Mohammad Ghavamzadeh, Panganamala R. Kumar

Many reinforcement learning (RL) tasks have specific properties that can be leveraged to modify existing RL algorithms to adapt to those tasks and further improve performance, and a general class of such properties is the multiple reward…

Machine Learning · Computer Science 2019-11-07 Zichuan Lin, Li Zhao, Derek Yang, Tao Qin, Guangwen Yang, Tie-Yan Liu

Diffusion models achieve state-of-the-art generative performance but are fundamentally bottlenecked by their slow, iterative sampling process. While diffusion distillation techniques enable high-fidelity, few-step generation, traditional…

Computer Vision and Pattern Recognition · Computer Science 2026-04-01 Linqian Fan, Peiqin Sun, Tiancheng Wen, Shun Lu, Chengru Song