Related papers: Hypercube Policy Regularization Framework for Offl…

A Behavior Regularized Implicit Policy for Offline Reinforcement Learning

Offline reinforcement learning enables learning from a fixed dataset, without further interactions with the environment. The lack of environmental interactions makes the policy training vulnerable to state-action pairs far from the training…

Machine Learning · Statistics 2022-10-11 Shentao Yang, Zhendong Wang, Huangjie Zheng, Yihao Feng, Mingyuan Zhou

SelfBC: Self Behavior Cloning for Offline Reinforcement Learning

Policy constraint methods in offline reinforcement learning employ additional regularization techniques to constrain the discrepancy between the learned policy and the offline dataset. However, these methods tend to result in overly…

Machine Learning · Computer Science 2024-08-06 Shirong Liu, Chenjia Bai, Zixian Guo, Hao Zhang, Gaurav Sharma, Yang Liu

Improving TD3-BC: Relaxed Policy Constraint for Offline Learning and Stable Online Fine-Tuning

The ability to discover optimal behaviour from fixed data sets has the potential to transfer the successes of reinforcement learning (RL) to domains where data collection is acutely problematic. In this offline setting, a key challenge is…

Machine Learning · Computer Science 2022-11-23 Alex Beeson, Giovanni Montana

Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning

Offline reinforcement learning (RL) extends the paradigm of classical RL algorithms to purely learning from static datasets, without interacting with the underlying environment during the learning process. A key challenge of offline RL is…

Machine Learning · Computer Science 2022-06-16 Shentao Yang, Yihao Feng, Shujian Zhang, Mingyuan Zhou

Iteratively Refined Behavior Regularization for Offline Reinforcement Learning

One of the fundamental challenges for offline reinforcement learning (RL) is ensuring robustness to data distribution. Whether the data originates from a near-optimal policy or not, we anticipate that an algorithm should demonstrate its…

Machine Learning · Computer Science 2023-10-18 Xiaohan Hu, Yi Ma, Chenjun Xiao, Yan Zheng, Jianye Hao

Evaluation-Time Policy Switching for Offline Reinforcement Learning

Offline reinforcement learning (RL) looks at learning how to optimally solve tasks using a fixed dataset of interactions from the environment. Many off-policy algorithms developed for online learning struggle in the offline setting as they…

Machine Learning · Computer Science 2025-03-18 Natinael Solomon Neggatu, Jeremie Houssineau, Giovanni Montana

Offline Reinforcement Learning with Behavioral Supervisor Tuning

Offline reinforcement learning (RL) algorithms are applied to learn performant, well-generalizing policies when provided with a static dataset of interactions. Many recent approaches to offline RL have seen substantial success, but with one…

Machine Learning · Computer Science 2024-07-30 Padmanaba Srinivasan, William Knottenbelt

BRAC+: Improved Behavior Regularized Actor Critic for Offline Reinforcement Learning

Online interactions with the environment to collect data samples for training a Reinforcement Learning (RL) agent is not always feasible due to economic and safety concerns. The goal of Offline Reinforcement Learning is to address this…

Machine Learning · Computer Science 2021-10-05 Chi Zhang, Sanmukh Rao Kuppannagari, Viktor K Prasanna

Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets

Offline policy learning is aimed at learning decision-making policies using existing datasets of trajectories without collecting additional data. The primary motivation for using reinforcement learning (RL) instead of supervised learning…

Machine Learning · Computer Science 2023-10-13 Zhang-Wei Hong, Aviral Kumar, Sathwik Karnik, Abhishek Bhandwaldar, Akash Srivastava, Joni Pajarinen, Romain Laroche, Abhishek Gupta, Pulkit Agrawal

Offline Reinforcement Learning with Implicit Q-Learning

Offline reinforcement learning requires reconciling two conflicting aims: learning a policy that improves over the behavior policy that collected the dataset, while at the same time minimizing the deviation from the behavior policy so as to…

Machine Learning · Computer Science 2021-10-13 Ilya Kostrikov, Ashvin Nair, Sergey Levine

State-Constrained Offline Reinforcement Learning

Traditional offline reinforcement learning (RL) methods predominantly operate in a batch-constrained setting. This confines the algorithms to a specific state-action distribution present in the dataset, reducing the effects of…

Machine Learning · Statistics 2025-07-16 Charles A. Hepburn, Yue Jin, Giovanni Montana

Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning

Offline reinforcement learning (RL), which aims to learn an optimal policy using a previously collected static dataset, is an important paradigm of RL. Standard RL methods often perform poorly in this regime due to the function…

Machine Learning · Computer Science 2023-08-29 Zhendong Wang, Jonathan J Hunt, Mingyuan Zhou

Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints

Offline reinforcement learning (RL) learns policies entirely from static datasets, thereby avoiding the challenges associated with online data collection. Practical applications of offline RL will inevitably require learning from datasets…

Machine Learning · Computer Science 2022-11-22 Anikait Singh, Aviral Kumar, Quan Vuong, Yevgen Chebotar, Sergey Levine

Policy Regularization with Dataset Constraint for Offline Reinforcement Learning

We consider the problem of learning the best possible policy from a fixed dataset, known as offline Reinforcement Learning (RL). A common taxonomy of existing offline RL works is policy regularization, which typically constrains the learned…

Machine Learning · Computer Science 2023-08-16 Yuhang Ran, Yi-Chen Li, Fuxiang Zhang, Zongzhang Zhang, Yang Yu

ReFORM: Reflected Flows for On-support Offline RL via Noise Manipulation

Offline reinforcement learning (RL) aims to learn the optimal policy from a fixed dataset generated by behavior policies without additional environment interactions. One common challenge that arises in this setting is the…

Machine Learning · Computer Science 2026-02-06 Songyuan Zhang, Oswin So, H. M. Sabbir Ahmad, Eric Yang Yu, Matthew Cleaveland, Mitchell Black, Chuchu Fan

Don't Trade Off Safety: Diffusion Regularization for Constrained Offline RL

Constrained reinforcement learning (RL) seeks high-performance policies under safety constraints. We focus on an offline setting where the agent has only a fixed dataset -- common in realistic tasks to prevent unsafe exploration. To address…

Machine Learning · Computer Science 2025-09-08 Junyu Guo, Zhi Zheng, Donghao Ying, Ming Jin, Shangding Gu, Costas Spanos, Javad Lavaei

Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning

Offline reinforcement learning, by learning from a fixed dataset, makes it possible to learn agent behaviors without interacting with the environment. However, depending on the quality of the offline dataset, such pre-trained agents may…

Machine Learning · Computer Science 2022-10-26 Yi Zhao, Rinu Boney, Alexander Ilin, Juho Kannala, Joni Pajarinen

Robust Predictable Control

Many of the challenges facing today's reinforcement learning (RL) algorithms, such as robustness, generalization, transfer, and computational efficiency are closely related to compression. Prior work has convincingly argued why minimizing…

Machine Learning · Computer Science 2021-09-08 Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions

Offline reinforcement learning (RL) allows for the training of competent agents from offline datasets without any interaction with the environment. Online finetuning of such offline models can further improve performance. But how should we…

Machine Learning · Computer Science 2023-03-31 Yicheng Luo, Jackie Kay, Edward Grefenstette, Marc Peter Deisenroth

Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices

Offline reinforcement learning (RL), which seeks to learn an optimal policy using offline data, has garnered significant interest due to its potential in critical applications where online data collection is infeasible or expensive. This…

Machine Learning · Computer Science 2024-02-09 Jiin Woo, Laixi Shi, Gauri Joshi, Yuejie Chi