Related papers: Policy-Guided Diffusion

World Models via Policy-Guided Trajectory Diffusion

World models are a powerful tool for developing intelligent agents. By predicting the outcome of a sequence of actions, world models enable policies to be optimised via on-policy reinforcement learning (RL) using synthetic data, i.e. in "in…

Machine Learning · Computer Science 2024-03-28 Marc Rigter, Jun Yamada, Ingmar Posner

Learning from Random Demonstrations: Offline Reinforcement Learning with Importance-Sampled Diffusion Models

Generative models such as diffusion have been employed as world models in offline reinforcement learning to generate synthetic data for more effective learning. Existing work either generates diffusion models one-time prior to training or…

Machine Learning · Computer Science 2024-05-31 Zeyu Fang, Tian Lan

Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning

One important property of DIstribution Correction Estimation (DICE) methods is that the solution is the optimal stationary distribution ratio between the optimized and data collection policy. In this work, we show that DICE-based methods…

Machine Learning · Computer Science 2024-11-01 Liyuan Mao, Haoran Xu, Xianyuan Zhan, Weinan Zhang, Amy Zhang

Enabling Stateful Behaviors for Diffusion-based Policy Learning

While imitation learning provides a simple and effective framework for policy learning, acquiring consistent actions during robot execution remains a challenging task. Existing approaches primarily focus on either modifying the action…

Robotics · Computer Science 2024-07-24 Xiao Liu, Fabian Weigend, Yifan Zhou, Heni Ben Amor

DyDiff: Long-Horizon Rollout via Dynamics Diffusion for Offline Reinforcement Learning

With the great success of diffusion models (DMs) in generating realistic synthetic vision data, many researchers have investigated their potential in decision-making and control. Most of these works utilized DMs to sample directly from the…

Machine Learning · Computer Science 2026-05-19 Hanye Zhao, Xiaoshen Han, Zhengbang Zhu, Minghuan Liu, Yong Yu, De-Chuan Zhan, Weinan Zhang

Score Regularized Policy Optimization through Diffusion Behavior

Recent developments in offline reinforcement learning have uncovered the immense potential of diffusion modeling, which excels at representing heterogeneous behavior policies. However, sampling from diffusion policies is considerably slow…

Machine Learning · Computer Science 2024-03-18 Huayu Chen, Cheng Lu, Zhengyi Wang, Hang Su, Jun Zhu

WARPD: World model Assisted Reactive Policy Diffusion

With the increasing availability of open-source robotic data, imitation learning has become a promising approach for both manipulation and locomotion. Diffusion models are now widely used to train large, generalized policies that predict…

Machine Learning · Computer Science 2025-12-15 Shashank Hegde, Satyajeet Das, Gautam Salhotra, Gaurav S. Sukhatme

FlowQ: Energy-Guided Flow Policies for Offline Reinforcement Learning

The use of guidance to steer sampling toward desired outcomes has been widely explored within diffusion models, especially in applications such as image and trajectory generation. However, incorporating guidance during training remains…

Machine Learning · Computer Science 2025-05-21 Marvin Alles, Nutan Chen, Patrick van der Smagt, Botond Cseke

Continual Offline Reinforcement Learning via Diffusion-based Dual Generative Replay

We study continual offline reinforcement learning, a practical paradigm that facilitates forward transfer and mitigates catastrophic forgetting to tackle sequential offline tasks. We propose a dual generative replay framework that retains…

Machine Learning · Computer Science 2024-04-19 Jinmei Liu, Wenbin Li, Xiangyu Yue, Shilin Zhang, Chunlin Chen, Zhi Wang

Beyond Imitation: Reinforcement Learning Fine-Tuning for Adaptive Diffusion Navigation Policies

Diffusion-based robot navigation policies trained on large-scale imitation learning datasets can generate multi-modal trajectories directly from the robot's visual observations, bypassing the traditional localization-mapping-planning…

Robotics · Computer Science 2026-03-16 Junhe Sheng, Ruofei Bai, Kuan Xu, Ruimeng Liu, Jie Chen, Shenghai Yuan, Wei-Yun Yau, Lihua Xie

Planning with Diffusion for Flexible Behavior Synthesis

Model-based reinforcement learning methods often use learning only for the purpose of estimating an approximate dynamics model, offloading the rest of the decision-making work to classical trajectory optimizers. While conceptually simple,…

Machine Learning · Computer Science 2022-12-22 Michael Janner, Yilun Du, Joshua B. Tenenbaum, Sergey Levine

PPGuide: Steering Diffusion Policies with Performance Predictive Guidance

Diffusion policies have been shown to be very efficient at learning complex, multi-modal behaviors for robotic manipulation. However, errors in generated action sequences can compound over time, which can lead to failure. Some…

Robotics · Computer Science 2026-03-12 Zixing Wang, Devesh K. Jha, Ahmed H. Qureshi, Diego Romeres

Multi-Agent Formation Navigation Using Diffusion-Based Trajectory Generation

This paper introduces a diffusion-based planner for leader-follower formation control in cluttered environments. The diffusion policy is used to generate the trajectory of the midpoint of two leaders as a rigid bar in the plane, thereby…

Robotics · Computer Science 2026-01-19 Hieu Do Quang, Chien Truong-Quoc, Quoc Van Tran

DynaGuide: Steering Diffusion Polices with Active Dynamic Guidance

Deploying large, complex policies in the real world requires the ability to steer them to fit the needs of a situation. Most common steering approaches, like goal-conditioning, require training the robot policy with a distribution of…

Robotics · Computer Science 2025-11-11 Maximilian Du, Shuran Song

Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models

Diffusion models have emerged as powerful generative frameworks by progressively adding noise to data through a forward process and then reversing this process to generate realistic samples. While these models have achieved strong…

Machine Learning · Computer Science 2025-03-04 Xingzhuo Guo, Yu Zhang, Baixu Chen, Haoran Xu, Jianmin Wang, Mingsheng Long

Model Generation with Provable Coverability for Offline Reinforcement Learning

Model-based offline optimization with dynamics-aware policy provides a new perspective for policy learning and out-of-distribution generalization, where the learned policy could adapt to different dynamics enumerated at the training stage.…

Machine Learning · Computer Science 2022-06-09 Chengxing Jia, Hao Yin, Chenxiao Gao, Tian Xu, Lei Yuan, Zongzhang Zhang, Yang Yu

Distributed Policy Evaluation Under Multiple Behavior Strategies

We apply diffusion strategies to develop a fully-distributed cooperative reinforcement learning algorithm in which agents in a network communicate only with their immediate neighbors to improve predictions about their environment. The…

Multiagent Systems · Computer Science 2014-11-06 Sergio Valcarcel Macua, Jianshu Chen, Santiago Zazo, Ali H. Sayed

Improved off-policy training of diffusion samplers

We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational…

Machine Learning · Computer Science 2025-01-15 Marcin Sendera, Minsu Kim, Sarthak Mittal, Pablo Lemos, Luca Scimeca, Jarrid Rector-Brooks, Alexandre Adam, Yoshua Bengio, Nikolay Malkin

Diffusion Predictive Control with Constraints

Diffusion models have become popular for policy learning in robotics due to their ability to capture high-dimensional and multimodal distributions. However, diffusion policies are stochastic and typically trained offline, limiting their…

Robotics · Computer Science 2025-05-28 Ralf Römer, Alexander von Rohr, Angela P. Schoellig

Diffusion-Based Environment-Aware Trajectory Prediction

The ability to predict the future trajectories of traffic participants is crucial for the safe and efficient operation of autonomous vehicles. In this paper, a diffusion-based generative model for multi-agent trajectory prediction is…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Theodor Westny, Björn Olofsson, Erik Frisk