Related papers: Diffusion Policy through Conditional Proximal Poli…

Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning

Offline reinforcement learning (RL), which aims to learn an optimal policy using a previously collected static dataset, is an important paradigm of RL. Standard RL methods often perform poorly in this regime due to the function…

Machine Learning · Computer Science 2023-08-29 Zhendong Wang , Jonathan J Hunt , Mingyuan Zhou

GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning

Recent advances in reinforcement learning (RL) have demonstrated the powerful exploration capabilities and multimodality of generative diffusion-based policies. While substantial progress has been made in offline RL and off-policy RL…

Machine Learning · Computer Science 2026-01-23 Shutong Ding , Ke Hu , Shan Zhong , Haoyang Luo , Weinan Zhang , Jingya Wang , Jun Wang , Ye Shi

Reinforcement Learning with Discrete Diffusion Policies for Combinatorial Action Spaces

Reinforcement learning (RL) struggles to scale to large, combinatorial action spaces common in many real-world problems. This paper introduces a novel framework for training discrete diffusion models as highly effective policies in these…

Machine Learning · Computer Science 2026-05-21 Haitong Ma , Ofir Nabati , Aviv Rosenberg , Bo Dai , Oran Lang , Craig Boutilier , Na Li , Shie Mannor , Lior Shani , Guy Tenneholtz

Policy Representation via Diffusion Probability Model for Reinforcement Learning

Popular reinforcement learning (RL) algorithms tend to produce a unimodal policy distribution, which weakens the expressiveness of complicated policy and decays the ability of exploration. The diffusion probability model is powerful to…

Machine Learning · Computer Science 2023-05-23 Long Yang , Zhixiong Huang , Fenghao Lei , Yucun Zhong , Yiming Yang , Cong Fang , Shiting Wen , Binbin Zhou , Zhouchen Lin

Revisiting Generative Policies: A Simpler Reinforcement Learning Algorithmic Perspective

Generative models, particularly diffusion models, have achieved remarkable success in density estimation for multimodal data, drawing significant interest from the reinforcement learning (RL) community, especially in policy modeling in…

Machine Learning · Computer Science 2024-12-03 Jinouwen Zhang , Rongkun Xue , Yazhe Niu , Yun Chen , Jing Yang , Hongsheng Li , Yu Liu

Efficient Online Reinforcement Learning for Diffusion Policy

Diffusion policies have achieved superior performance in imitation learning and offline reinforcement learning (RL) due to their rich expressiveness. However, the conventional diffusion training procedure requires samples from target…

Machine Learning · Computer Science 2025-07-01 Haitong Ma , Tianyi Chen , Kai Wang , Na Li , Bo Dai

Score Regularized Policy Optimization through Diffusion Behavior

Recent developments in offline reinforcement learning have uncovered the immense potential of diffusion modeling, which excels at representing heterogeneous behavior policies. However, sampling from diffusion policies is considerably slow…

Machine Learning · Computer Science 2024-03-18 Huayu Chen , Cheng Lu , Zhengyi Wang , Hang Su , Jun Zhu

Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning

Offline reinforcement learning (RL) aims to learn optimal policies from previously collected datasets. Recently, due to their powerful representational capabilities, diffusion models have shown significant potential as policy models for…

Machine Learning · Computer Science 2024-05-30 Tianle Zhang , Jiayi Guan , Lin Zhao , Yihang Li , Dongjiang Li , Zecui Zeng , Lei Sun , Yue Chen , Xuelong Wei , Lusong Li , Xiaodong He

DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning

Offline reinforcement learning (RL) can learn optimal policies from pre-collected offline datasets without interacting with the environment, but the sampled actions of the agent cannot often cover the action distribution under a given…

Machine Learning · Computer Science 2024-06-14 Xuemin Hu , Shen Li , Yingfen Xu , Bo Tang , Long Chen

Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization

Diffusion models have garnered widespread attention in Reinforcement Learning (RL) for their powerful expressiveness and multimodality. It has been verified that utilizing diffusion policies can significantly improve the performance of RL…

Machine Learning · Computer Science 2024-12-17 Shutong Ding , Ke Hu , Zhenhao Zhang , Kan Ren , Weinan Zhang , Jingyi Yu , Jingya Wang , Ye Shi

Training Diffusion Models with Reinforcement Learning

Diffusion models are a class of flexible generative models trained with an approximation to the log-likelihood objective. However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives…

Machine Learning · Computer Science 2024-01-08 Kevin Black , Michael Janner , Yilun Du , Ilya Kostrikov , Sergey Levine

Flow-based Policy With Distributional Reinforcement Learning in Trajectory Optimization

Reinforcement Learning (RL) has proven highly effective in addressing complex control and decision-making tasks. However, in most traditional RL algorithms, the policy is typically parameterized as a diagonal Gaussian distribution, which…

Machine Learning · Computer Science 2026-04-02 Ruijie Hao , Longfei Zhang , Yang Dai , Yang Ma , Xingxing Liang , Guangquan Cheng

Dichotomous Diffusion Policy Optimization

Diffusion-based policies have gained growing popularity in solving a wide range of decision-making tasks due to their superior expressiveness and controllable generation during inference. However, effectively training large diffusion…

Machine Learning · Computer Science 2026-02-03 Ruiming Liang , Yinan Zheng , Kexin Zheng , Tianyi Tan , Jianxiong Li , Liyuan Mao , Zhihao Wang , Guang Chen , Hangjun Ye , Jingjing Liu , Jinqiao Wang , Xianyuan Zhan

Diffusion Policy Policy Optimization

We introduce Diffusion Policy Policy Optimization, DPPO, an algorithmic framework including best practices for fine-tuning diffusion-based policies (e.g. Diffusion Policy) in continuous control and robot learning tasks using the policy…

Robotics · Computer Science 2024-12-11 Allen Z. Ren , Justin Lidard , Lars L. Ankile , Anthony Simeonov , Pulkit Agrawal , Anirudha Majumdar , Benjamin Burchfiel , Hongkai Dai , Max Simchowitz

Mean Flow Policy Optimization

Diffusion models have recently emerged as expressive policy representations for online reinforcement learning (RL). However, their iterative generative processes introduce substantial training and inference overhead. To overcome this…

Machine Learning · Computer Science 2026-04-17 Xiaoyi Dong , Xi Sheryl Zhang , Jian Cheng

FlowRL: A Taxonomy and Modular Framework for Reinforcement Learning with Diffusion Policies

Thanks to their remarkable flexibility, diffusion models and flow models have emerged as promising candidates for policy representation. However, efficient reinforcement learning (RL) upon these policies remains a challenge due to the lack…

Machine Learning · Computer Science 2026-03-31 Chenxiao Gao , Edward Chen , Tianyi Chen , Bo Dai

Sample-Efficient Diffusion-based Reinforcement Learning with Critic Guidance

Recent advances in reinforcement learning (RL) have achieved great successes by leveraging the multimodality and exploration capability of diffusion policies. Among these approaches, one representative branch focuses on the sampling-based…

Robotics · Computer Science 2026-05-29 Shutong Ding , Zejia Zhong , Zhongyi Wang , Ke Hu , Bikang Pan , Jingya Wang , Ye Shi

Overcoming Overfitting in Reinforcement Learning via Gaussian Process Diffusion Policy

One of the key challenges that Reinforcement Learning (RL) faces is its limited capability to adapt to a change of data distribution caused by uncertainties. This challenge arises especially in RL systems using deep neural networks as…

Machine Learning · Computer Science 2025-06-17 Amornyos Horprasert , Esa Apriaskar , Xingyu Liu , Lanlan Su , Lyudmila S. Mihaylova

Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning

Conditional decision generation with diffusion models has shown powerful competitiveness in reinforcement learning (RL). Recent studies reveal the relation between energy-function-guidance diffusion models and constrained RL problems. The…

Machine Learning · Computer Science 2025-05-06 Jifeng Hu , Sili Huang , Zhejian Yang , Shengchao Hu , Li Shen , Hechang Chen , Lichao Sun , Yi Chang , Dacheng Tao

Categorical Policies: Multimodal Policy Learning and Exploration in Continuous Control

A policy in deep reinforcement learning (RL), either deterministic or stochastic, is commonly parameterized as a Gaussian distribution alone, limiting the learned behavior to be unimodal. However, the nature of many practical…

Machine Learning · Computer Science 2025-08-20 SM Mazharul Islam , Manfred Huber