Related papers: Mean Flow Policy Optimization

200 papers

Flow-based generative models, including diffusion models, excel at modeling continuous distributions in high-dimensional spaces. In this work, we introduce Flow Policy Optimization (FPO), a simple on-policy reinforcement learning algorithm…

Machine Learning · Computer Science 2025-08-04 David McAllister, Songwei Ge, Brent Yi, Chung Min Kim, Ethan Weber, Hongsuk Choi, Haiwen Feng, Angjoo Kanazawa

Diffusion policies have achieved great success in online reinforcement learning (RL) due to their strong expressive capacity. However, the inference of diffusion policy models relies on a slow iterative sampling process, which limits their…

Machine Learning · Computer Science 2025-10-17 Tianyi Chen, Haitong Ma, Na Li, Kai Wang, Bo Dai

Real-time robotic control demands fast action generation. However, existing generative policies based on diffusion and flow matching require multi-step sampling, fundamentally limiting deployment in time-critical scenarios. We propose…

Robotics · Computer Science 2026-01-29 Guowei Zou, Haitao Wang, Hejun Wu, Yukun Qian, Yuhang Wang, Weibing Li

Reinforcement learning (RL) has been extensively employed in a wide range of decision-making problems, such as games and robotics. Recently, diffusion policies have shown strong potential in modeling multi-modal behaviors, enabling more…

Machine Learning · Computer Science 2026-03-06 Ben Liu, Shunpeng Yang, Hua Chen

Diffusion policies are expressive yet incur high inference latency. Flow Matching (FM) enables one-step generation, but integrating it into Maximum Entropy Reinforcement Learning (MaxEnt RL) is challenging: the optimal policy is an…

Machine Learning · Computer Science 2026-02-03 Zeqiao Li, Yijing Wang, Haoyu Wang, Zheng Li, Zhiqiang Zuo

Generative models, particularly diffusion models, have achieved remarkable success in density estimation for multimodal data, drawing significant interest from the reinforcement learning (RL) community, especially in policy modeling in…

Machine Learning · Computer Science 2024-12-03 Jinouwen Zhang, Rongkun Xue, Yazhe Niu, Yun Chen, Jing Yang, Hongsheng Li, Yu Liu

Recent advances in reinforcement learning (RL) have demonstrated the powerful exploration capabilities and multimodality of generative diffusion-based policies. While substantial progress has been made in offline RL and off-policy RL…

Machine Learning · Computer Science 2026-01-23 Shutong Ding, Ke Hu, Shan Zhong, Haoyang Luo, Weinan Zhang, Jingya Wang, Jun Wang, Ye Shi

Diffusion models are a class of flexible generative models trained with an approximation to the log-likelihood objective. However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives…

Machine Learning · Computer Science 2024-01-08 Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine

Diffusion models have garnered widespread attention in Reinforcement Learning (RL) for their powerful expressiveness and multimodality. It has been verified that utilizing diffusion policies can significantly improve the performance of RL…

Machine Learning · Computer Science 2024-12-17 Shutong Ding, Ke Hu, Zhenhao Zhang, Kan Ren, Weinan Zhang, Jingyi Yu, Jingya Wang, Ye Shi

Thanks to their remarkable flexibility, diffusion models and flow models have emerged as promising candidates for policy representation. However, efficient reinforcement learning (RL) upon these policies remains a challenge due to the lack…

Machine Learning · Computer Science 2026-03-31 Chenxiao Gao, Edward Chen, Tianyi Chen, Bo Dai

We introduce Diffusion Policy Policy Optimization, DPPO, an algorithmic framework including best practices for fine-tuning diffusion-based policies (e.g. Diffusion Policy) in continuous control and robot learning tasks using the policy…

Popular reinforcement learning (RL) algorithms tend to produce a unimodal policy distribution, which weakens the expressiveness of complicated policy and decays the ability of exploration. The diffusion probability model is powerful to…

Machine Learning · Computer Science 2023-05-23 Long Yang, Zhixiong Huang, Fenghao Lei, Yucun Zhong, Yiming Yang, Cong Fang, Shiting Wen, Binbin Zhou, Zhouchen Lin

We propose DiFFPO, Diffusion Fast and Furious Policy Optimization, a unified framework for training masked diffusion large language models (dLLMs) to reason not only better (furious), but also faster via reinforcement learning (RL). We…

Machine Learning · Computer Science 2026-01-13 Hanyang Zhao, Dawen Liang, Wenpin Tang, David Yao, Nathan Kallus

Model-based reinforcement learning (RL) can be effectively supported at scale through the use of world models. However, in practice, scaling such approaches remains fundamentally limited. A commonly recognized challenge is model bias and…

Machine Learning · Computer Science 2026-05-27 Xiaoyuan Cheng, Wenxuan Yuan, Zhancun Mu, Yuanzhao Zhang, Yiming Yang, Hai Wang, Zhuo Sun, Che Liu

Generative models, especially diffusion and flow-based models, have been promising in offline multi-agent reinforcement learning. However, integrating powerful generative models into this framework poses unique challenges. In particular,…

Machine Learning · Computer Science 2026-03-02 Zhuoran Li, Xun Wang, Hai Zhong, Qingxin Xia, Lihua Zhang, Longbo Huang

We propose ReinFlow, a simple yet effective online reinforcement learning (RL) framework that fine-tunes a family of flow matching policies for continuous robotic control. Derived from rigorous RL theory, ReinFlow injects learnable noise…

Robotics · Computer Science 2026-01-09 Tonghe Zhang, Chao Yu, Sichang Su, Yu Wang

Diffusion and flow matching have emerged as expressive policy classes in reinforcement learning, but their reliance on multi-step denoising imposes substantial computational overhead at inference time, which is particularly problematic in…

Machine Learning · Computer Science 2026-05-25 Kyungyoon Kim, Donghyeon Ki, Hee-Jun Ahn, Byung-Jun Lee

Conditional decision generation with diffusion models has shown powerful competitiveness in reinforcement learning (RL). Recent studies reveal the relation between energy-function-guidance diffusion models and constrained RL problems. The…

Machine Learning · Computer Science 2025-05-06 Jifeng Hu, Sili Huang, Zhejian Yang, Shengchao Hu, Li Shen, Hechang Chen, Lichao Sun, Yi Chang, Dacheng Tao

Diffusion language models, as a promising alternative to traditional autoregressive (AR) models, enable faster generation and richer conditioning on bidirectional context. However, they suffer from a key discrepancy between training and…

Machine Learning · Computer Science 2025-09-26 Haoyu He, Katrin Renz, Yong Cao, Andreas Geiger

Offline reinforcement learning (RL) aims to learn optimal policies from previously collected datasets. Recently, due to their powerful representational capabilities, diffusion models have shown significant potential as policy models for…

Machine Learning · Computer Science 2024-05-30 Tianle Zhang, Jiayi Guan, Lin Zhao, Yihang Li, Dongjiang Li, Zecui Zeng, Lei Sun, Yue Chen, Xuelong Wei, Lusong Li, Xiaodong He