Related papers: Offline Critic-Guided Diffusion Policy for Multi-U…

Continual Offline Reinforcement Learning via Diffusion-based Dual Generative Replay

We study continual offline reinforcement learning, a practical paradigm that facilitates forward transfer and mitigates catastrophic forgetting to tackle sequential offline tasks. We propose a dual generative replay framework that retains…

Machine Learning · Computer Science 2024-04-19 Jinmei Liu, Wenbin Li, Xiangyu Yue, Shilin Zhang, Chunlin Chen, Zhi Wang

Structural Information-based Hierarchical Diffusion for Offline Reinforcement Learning

Diffusion-based generative methods have shown promising potential for modeling trajectories from offline reinforcement learning (RL) datasets, and hierarchical diffusion has been introduced to mitigate variance accumulation and…

Machine Learning · Computer Science 2025-09-29 Xianghua Zeng, Hao Peng, Angsheng Li, Yicheng Pan

Stitching Sub-Trajectories with Conditional Diffusion Model for Goal-Conditioned Offline RL

Offline Goal-Conditioned Reinforcement Learning (Offline GCRL) is an important problem in RL that focuses on acquiring diverse goal-oriented skills solely from pre-collected behavior datasets. In this setting, the reward feedback is…

Artificial Intelligence · Computer Science 2024-02-13 Sungyoon Kim, Yunseon Choi, Daiki E. Matsunaga, Kee-Eung Kim

Instructed Diffuser with Temporal Condition Guidance for Offline Reinforcement Learning

Recent works have shown the potential of diffusion models in computer vision and natural language processing. Apart from the classical supervised learning fields, diffusion models have also shown strong competitiveness in reinforcement…

Machine Learning · Computer Science 2023-06-09 Jifeng Hu, Yanchao Sun, Sili Huang, SiYuan Guo, Hechang Chen, Li Shen, Lichao Sun, Yi Chang, Dacheng Tao

Effective Multi-User Delay-Constrained Scheduling with Deep Recurrent Reinforcement Learning

Multi-user delay constrained scheduling is important in many real-world applications including wireless communication, live streaming, and cloud computing. Yet, it poses a critical challenge since the scheduler needs to make real-time…

Machine Learning · Computer Science 2022-08-31 Pihe Hu, Ling Pan, Yu Chen, Zhixuan Fang, Longbo Huang

Cross-Domain Energy-Guided Diffusion Generation for Off-Dynamics Reinforcement Learning

Off-dynamics offline reinforcement learning seeks to learn a target-domain policy from a large source dataset and a limited target dataset under mismatched transition dynamics. Existing approaches such as reward augmentation and data…

Machine Learning · Computer Science 2026-05-26 Yu Yang, Yihong Guo, Anqi Liu, Pan Xu

Learning from Random Demonstrations: Offline Reinforcement Learning with Importance-Sampled Diffusion Models

Generative models such as diffusion have been employed as world models in offline reinforcement learning to generate synthetic data for more effective learning. Existing work either generates diffusion models one-time prior to training or…

Machine Learning · Computer Science 2024-05-31 Zeyu Fang, Tian Lan

DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models

Reinforcement learning has emerged as a powerful tool for improving diffusion-based text-to-image models, but existing methods are largely limited to single-task optimization. Extending RL to multiple tasks is challenging: joint…

Machine Learning · Computer Science 2026-05-15 Quanhao Li, Junqiu Yu, Kaixun Jiang, Yujie Wei, Zhen Xing, Pandeng Li, Ruihang Chu, Shiwei Zhang, Yu Liu, Zuxuan Wu

How Does the Lagrangian Guide Safe Reinforcement Learning through Diffusion Models?

Diffusion policy sampling enables reinforcement learning (RL) to represent multimodal action distributions beyond suboptimal unimodal Gaussian policies. However, existing diffusion-based RL methods primarily focus on offline settings for…

Machine Learning · Computer Science 2026-05-07 Xiaoyuan Cheng, Wenxuan Yuan, Boyang Li, Yuanchao Xu, Yiming Yang, Hao Liang, Bei Peng, Robert Loftin, Zhuo Sun, Yukun Hu

CODA: Coordination via On-Policy Diffusion for Multi-Agent Offline Reinforcement Learning

Offline multi-agent reinforcement learning (MARL) enables policy learning from fixed datasets, but is prone to coordination failure: agents trained on static, off-policy data converge to suboptimal joint behaviours because they cannot…

Machine Learning · Computer Science 2026-04-28 Marcel Hedman, Kale-ab Abebe Tessera, Juan Claude Formanek, Anya Sims, Riccardo Zamboni, Trevor McInroe, John Torr, Elliot Fosong

Boosting Offline Reinforcement Learning via Data Rebalancing

Offline reinforcement learning (RL) is challenged by the distributional shift between learning policies and datasets. To address this problem, existing works mainly focus on designing sophisticated algorithms to explicitly or implicitly…

Machine Learning · Computer Science 2022-10-18 Yang Yue, Bingyi Kang, Xiao Ma, Zhongwen Xu, Gao Huang, Shuicheng Yan

Hybrid Online-Offline Learning for Task Offloading in Mobile Edge Computing Systems

We consider a multi-user multi-server mobile edge computing (MEC) system, in which users arrive on a network randomly over time and generate computation tasks, which will be computed either locally on their own computing devices or be…

Signal Processing · Electrical Eng. & Systems 2024-02-28 Muhammad Sohaib, Sang-Woon Jeon, Wei Yu

Offline Reinforcement Learning for Optimizing Production Bidding Policies

The online advertising market, with its thousands of auctions run per second, presents a daunting challenge for advertisers who wish to optimize their spend under a budget constraint. Thus, advertising platforms typically provide automated…

Machine Learning · Computer Science 2023-10-17 Dmytro Korenkevych, Frank Cheng, Artsiom Balakir, Alex Nikulkov, Lingnan Gao, Zhihao Cen, Zuobing Xu, Zheqing Zhu

Robust Policy Learning via Offline Skill Diffusion

Skill-based reinforcement learning (RL) approaches have shown considerable promise, especially in solving long-horizon tasks via hierarchical structures. These skills, learned task-agnostically from offline datasets, can accelerate the…

Machine Learning · Computer Science 2024-08-23 Woo Kyung Kim, Minjong Yoo, Honguk Woo

DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning

Offline reinforcement learning (RL) can learn optimal policies from pre-collected offline datasets without interacting with the environment, but the sampled actions of the agent cannot often cover the action distribution under a given…

Machine Learning · Computer Science 2024-06-14 Xuemin Hu, Shen Li, Yingfen Xu, Bo Tang, Long Chen

CGD: Constraint-Guided Diffusion Policies for UAV Trajectory Planning

Traditional optimization-based planners, while effective, suffer from high computational costs, resulting in slow trajectory generation. A successful strategy to reduce computation time involves using Imitation Learning (IL) to develop fast…

Robotics · Computer Science 2024-05-06 Kota Kondo, Andrea Tagliabue, Xiaoyi Cai, Claudius Tewari, Olivia Garcia, Marcos Espitia-Alvarez, Jonathan P. How

Efficient Controllable Diffusion via Optimal Classifier Guidance

The controllable generation of diffusion models aims to steer the model to generate samples that optimize some given objective functions. It is desirable for a variety of applications including image generation, molecule generation, and…

Machine Learning · Computer Science 2025-05-29 Owen Oertell, Shikun Sun, Yiding Chen, Jin Peng Zhou, Zhiyong Wang, Wen Sun

An Advanced Reinforcement Learning Framework for Online Scheduling of Deferrable Workloads in Cloud Computing

Efficient resource utilization and perfect user experience usually conflict with each other in cloud computing platforms. Great efforts have been invested in increasing resource utilization but trying not to affect users' experience for…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-04 Hang Dong, Liwen Zhu, Zhao Shan, Bo Qiao, Fangkai Yang, Si Qin, Chuan Luo, Qingwei Lin, Yuwen Yang, Gurpreet Virdi, Saravan Rajmohan, Dongmei Zhang, Thomas Moscibroda

SAMG: Offline-to-Online Reinforcement Learning via State-Action-Conditional Offline Model Guidance

Offline-to-online (O2O) reinforcement learning (RL) pre-trains models on offline data and refines policies through online fine-tuning. However, existing O2O RL algorithms typically require maintaining the tedious offline datasets to…

Machine Learning · Computer Science 2025-02-24 Liyu Zhang, Haochi Wu, Xu Wan, Quan Kong, Ruilong Deng, Mingyang Sun

Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL

Off-policy reinforcement learning (RL) has achieved notable success in tackling many complex real-world tasks, by leveraging previously collected data for policy learning. However, most existing off-policy RL algorithms fail to maximally…

Machine Learning · Computer Science 2024-05-30 Yu Luo, Tianying Ji, Fuchun Sun, Jianwei Zhang, Huazhe Xu, Xianyuan Zhan