Related papers: Offline Reinforcement Learning with Adaptive Behav…

Adaptive Replay Buffer for Offline-to-Online Reinforcement Learning

Offline-to-Online Reinforcement Learning (O2O RL) faces a critical dilemma in balancing the use of a fixed offline dataset with newly collected online experiences. Standard methods, often relying on a fixed data-mixing ratio, struggle to…

Machine Learning · Computer Science 2026-04-09 Chihyeon Song, Jaewoo Lee, Jinkyoo Park

A Minimalist Approach to Offline Reinforcement Learning

Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data. Due to errors in value estimation from out-of-distribution actions, most offline RL algorithms take the approach of constraining or regularizing…

Machine Learning · Computer Science 2021-12-06 Scott Fujimoto, Shixiang Shane Gu

Iteratively Refined Behavior Regularization for Offline Reinforcement Learning

One of the fundamental challenges for offline reinforcement learning (RL) is ensuring robustness to data distribution. Whether the data originates from a near-optimal policy or not, we anticipate that an algorithm should demonstrate its…

Machine Learning · Computer Science 2023-10-18 Xiaohan Hu, Yi Ma, Chenjun Xiao, Yan Zheng, Jianye Hao

Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning

Offline reinforcement learning (RL) extends the paradigm of classical RL algorithms to purely learning from static datasets, without interacting with the underlying environment during the learning process. A key challenge of offline RL is…

Machine Learning · Computer Science 2022-06-16 Shentao Yang, Yihao Feng, Shujian Zhang, Mingyuan Zhou

Adaptive Policy Learning for Offline-to-Online Reinforcement Learning

Conventional reinforcement learning (RL) needs an environment to collect fresh data, which is impractical when online interactions are costly. Offline RL provides an alternative solution by directly learning from the previously collected…

Machine Learning · Computer Science 2023-03-15 Han Zheng, Xufang Luo, Pengfei Wei, Xuan Song, Dongsheng Li, Jing Jiang

Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets

Offline policy learning is aimed at learning decision-making policies using existing datasets of trajectories without collecting additional data. The primary motivation for using reinforcement learning (RL) instead of supervised learning…

Machine Learning · Computer Science 2023-10-13 Zhang-Wei Hong, Aviral Kumar, Sathwik Karnik, Abhishek Bhandwaldar, Akash Srivastava, Joni Pajarinen, Romain Laroche, Abhishek Gupta, Pulkit Agrawal

Sample-Efficient Policy Constraint Offline Deep Reinforcement Learning based on Sample Filtering

Offline reinforcement learning (RL) aims to learn a policy that maximizes the expected return using a given static dataset of transitions. However, offline RL faces the distribution shift problem. The policy constraint offline RL method is…

Machine Learning · Computer Science 2025-12-24 Yuanhao Chen, Qi Liu, Pengbin Chen, Zhongjian Qiao, Yanjie Li

Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning

Offline reinforcement learning, by learning from a fixed dataset, makes it possible to learn agent behaviors without interacting with the environment. However, depending on the quality of the offline dataset, such pre-trained agents may…

Machine Learning · Computer Science 2022-10-26 Yi Zhao, Rinu Boney, Alexander Ilin, Juho Kannala, Joni Pajarinen

Active Advantage-Aligned Online Reinforcement Learning with Offline Data

Online reinforcement learning (RL) enhances policies through direct interactions with the environment, but faces challenges related to sample efficiency. In contrast, offline RL leverages extensive pre-collected data to learn policies, but…

Machine Learning · Computer Science 2026-03-10 Xuefeng Liu, Hung T. C. Le, Siyu Chen, Rick Stevens, Zhuoran Yang, Matthew R. Walter, Yuxin Chen

Critic Regularized Regression

Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy optimization from large pre-recorded datasets without online environment interaction. It addresses challenges with regard to the cost of data…

Machine Learning · Computer Science 2021-09-24 Ziyu Wang, Alexander Novikov, Konrad Zolna, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Josh Merel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

Hundreds Guide Millions: Adaptive Offline Reinforcement Learning with Expert Guidance

Offline reinforcement learning (RL) optimizes the policy on a previously collected dataset without any interactions with the environment, yet usually suffers from the distributional shift problem. To mitigate this issue, a typical solution…

Machine Learning · Computer Science 2023-09-06 Qisen Yang, Shenzhi Wang, Qihang Zhang, Gao Huang, Shiji Song

Multi-Objective Decision Transformers for Offline Reinforcement Learning

Offline Reinforcement Learning (RL) is structured to derive policies from static trajectory data without requiring real-time environment interactions. Recent studies have shown the feasibility of framing offline RL as a sequence modeling…

Machine Learning · Computer Science 2023-09-01 Abdelghani Ghanem, Philippe Ciblat, Mounir Ghogho

Evaluation-Time Policy Switching for Offline Reinforcement Learning

Offline reinforcement learning (RL) looks at learning how to optimally solve tasks using a fixed dataset of interactions from the environment. Many off-policy algorithms developed for online learning struggle in the offline setting as they…

Machine Learning · Computer Science 2025-03-18 Natinael Solomon Neggatu, Jeremie Houssineau, Giovanni Montana

Policy-Driven World Model Adaptation for Robust Offline Model-based Reinforcement Learning

Offline reinforcement learning (RL) offers a powerful paradigm for data-driven control. Compared to model-free approaches, offline model-based RL (MBRL) explicitly learns a world model from a static dataset and uses it as a surrogate…

Machine Learning · Computer Science 2026-02-02 Jiayu Chen, Le Xu, Aravind Venugopal, Jeff Schneider

BRAC+: Improved Behavior Regularized Actor Critic for Offline Reinforcement Learning

Online interactions with the environment to collect data samples for training a Reinforcement Learning (RL) agent is not always feasible due to economic and safety concerns. The goal of Offline Reinforcement Learning is to address this…

Machine Learning · Computer Science 2021-10-05 Chi Zhang, Sanmukh Rao Kuppannagari, Viktor K Prasanna

Boosting Offline Reinforcement Learning via Data Rebalancing

Offline reinforcement learning (RL) is challenged by the distributional shift between learning policies and datasets. To address this problem, existing works mainly focus on designing sophisticated algorithms to explicitly or implicitly…

Machine Learning · Computer Science 2022-10-18 Yang Yue, Bingyi Kang, Xiao Ma, Zhongwen Xu, Gao Huang, Shuicheng Yan

Reducing Conservativeness Oriented Offline Reinforcement Learning

In offline reinforcement learning, a policy learns to maximize cumulative rewards with a fixed collection of data. Towards conservative strategy, current methods choose to regularize the behavior policy or learn a lower bound of the value…

Machine Learning · Computer Science 2021-03-02 Hongchang Zhang, Jianzhun Shao, Yuhang Jiang, Shuncheng He, Xiangyang Ji

Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation

A promising paradigm for offline reinforcement learning (RL) is to constrain the learned policy to stay close to the dataset behaviors, known as policy constraint offline RL. However, existing works heavily rely on the purity of the data,…

Machine Learning · Computer Science 2022-10-20 Chengqian Gao, Ke Xu, Liu Liu, Deheng Ye, Peilin Zhao, Zhiqiang Xu

Sample Efficient Active Algorithms for Offline Reinforcement Learning

Offline reinforcement learning (RL) enables policy learning from static data but often suffers from poor coverage of the state-action space and distributional shift problems. This problem can be addressed by allowing limited online…

Machine Learning · Computer Science 2026-02-03 Soumyadeep Roy, Shashwat Kushwaha, Ambedkar Dukkipati

Model-Based Offline Reinforcement Learning with Adversarial Data Augmentation

Model-based offline Reinforcement Learning (RL) constructs environment models from offline datasets to perform conservative policy optimization. Existing approaches focus on learning state transitions through ensemble models, rollouting…

Machine Learning · Computer Science 2025-03-27 Hongye Cao, Fan Feng, Jing Huo, Shangdong Yang, Meng Fang, Tianpei Yang, Yang Gao