Related papers: When Data Geometry Meets Deep Function: Generalizi…

Offline Reinforcement Learning with On-Policy Q-Function Regularization

The core challenge of offline reinforcement learning (RL) is dealing with the (potentially catastrophic) extrapolation error induced by the distribution shift between the history dataset and the desired policy. A large portion of prior work…

Machine Learning · Computer Science · 2023-07-27 · Laixi Shi, Robert Dadashi, Yuejie Chi, Pablo Samuel Castro, Matthieu Geist

The Generalization Gap in Offline Reinforcement Learning

Despite recent progress in offline learning, these methods are still trained and tested on the same environment. In this paper, we compare the generalization abilities of widely used online and offline learning methods such as online…

Machine Learning · Computer Science · 2024-03-18 · Ishita Mediratta, Qingfei You, Minqi Jiang, Roberta Raileanu

Doubly Mild Generalization for Offline Reinforcement Learning

Offline Reinforcement Learning (RL) suffers from the extrapolation error and value overestimation. From a generalization perspective, this issue can be attributed to the over-generalization of value functions or policies towards…

Machine Learning · Computer Science · 2024-11-14 · Yixiu Mao, Qi Wang, Yun Qu, Yuhang Jiang, Xiangyang Ji

Domain Generalization for Robust Model-Based Offline Reinforcement Learning

Existing offline reinforcement learning (RL) algorithms typically assume that training data is either: 1) generated by a known policy, or 2) of entirely unknown origin. We consider multi-demonstrator offline RL, a middle ground where we…

Machine Learning · Computer Science · 2022-11-29 · Alan Clark, Shoaib Ahmed Siddiqui, Robert Kirk, Usman Anwar, Stephen Chung, David Krueger

Boosting Offline Reinforcement Learning via Data Rebalancing

Offline reinforcement learning (RL) is challenged by the distributional shift between learning policies and datasets. To address this problem, existing works mainly focus on designing sophisticated algorithms to explicitly or implicitly…

Machine Learning · Computer Science · 2022-10-18 · Yang Yue, Bingyi Kang, Xiao Ma, Zhongwen Xu, Gao Huang, Shuicheng Yan

Offline Reinforcement Learning with Wasserstein Regularization via Optimal Transport Maps

Offline reinforcement learning (RL) aims to learn an optimal policy from a static dataset, making it particularly valuable in scenarios where data collection is costly, such as robotics. A major challenge in offline RL is distributional…

Machine Learning · Computer Science · 2025-07-16 · Motoki Omura, Yusuke Mukuta, Kazuki Ota, Takayuki Osa, Tatsuya Harada

Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods

In this paper, we explore deep reinforcement learning algorithms for vision-based robotic grasping. Model-free deep reinforcement learning (RL) has been successfully applied to a range of challenging environments, but the proliferation of…

Robotics · Computer Science · 2018-03-30 · Deirdre Quillen, Eric Jang, Ofir Nachum, Chelsea Finn, Julian Ibarz, Sergey Levine

Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies

When applying offline reinforcement learning (RL) in healthcare scenarios, the out-of-distribution (OOD) issues pose significant risks, as inappropriate generalization beyond clinical expertise can result in potentially harmful…

Machine Learning · Computer Science · 2025-05-23 · Runze Yan, Xun Shen, Akifumi Wachi, Sebastien Gros, Anni Zhao, Xiao Hu

Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations

Generalization and sample efficiency have been long-standing issues concerning reinforcement learning, and thus the field of Offline Meta-Reinforcement Learning (OMRL) has gained increasing attention due to its potential of solving a wide…

Machine Learning · Computer Science · 2023-12-27 · Renzhe Zhou, Chen-Xiao Gao, Zongzhang Zhang, Yang Yu

Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning

Offline reinforcement learning (RL) extends the paradigm of classical RL algorithms to purely learning from static datasets, without interacting with the underlying environment during the learning process. A key challenge of offline RL is…

Machine Learning · Computer Science · 2022-06-16 · Shentao Yang, Yihao Feng, Shujian Zhang, Mingyuan Zhou

GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning

Deep Q Network (DQN) firstly kicked the door of deep reinforcement learning (DRL) via combining deep learning (DL) with reinforcement learning (RL), which has noticed that the distribution of the acquired data would change during the…

Machine Learning · Computer Science · 2022-01-11 · Jiajun Fan, Changnan Xiao, Yue Huang

State-Constrained Offline Reinforcement Learning

Traditional offline reinforcement learning (RL) methods predominantly operate in a batch-constrained setting. This confines the algorithms to a specific state-action distribution present in the dataset, reducing the effects of…

Machine Learning · Statistics · 2025-07-16 · Charles A. Hepburn, Yue Jin, Giovanni Montana

A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems

With the widespread adoption of deep learning, reinforcement learning (RL) has experienced a dramatic increase in popularity, scaling to previously intractable problems, such as playing complex games from pixel observations, sustaining…

Machine Learning · Computer Science · 2023-04-20 · Rafael Figueiredo Prudencio, Marcos R. O. A. Maximo, Esther Luna Colombini

Diffusion Policies for Out-of-Distribution Generalization in Offline Reinforcement Learning

Offline Reinforcement Learning (RL) methods leverage previous experiences to learn better policies than the behavior policy used for data collection. However, they face challenges handling distribution shifts due to the lack of online…

Machine Learning · Computer Science · 2025-06-09 · Suzan Ece Ada, Erhan Oztop, Emre Ugur

Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation

Among the reasons hindering reinforcement learning (RL) applications to real-world problems, two factors are critical: limited data and the mismatch between the testing environment (real environment in which the policy is deployed) and the…

Machine Learning · Computer Science · 2023-01-30 · Xiaoteng Ma, Zhipeng Liang, Jose Blanchet, Mingwen Liu, Li Xia, Jiheng Zhang, Qianchuan Zhao, Zhengyuan Zhou

Offline Reinforcement Learning with Implicit Q-Learning

Offline reinforcement learning requires reconciling two conflicting aims: learning a policy that improves over the behavior policy that collected the dataset, while at the same time minimizing the deviation from the behavior policy so as to…

Machine Learning · Computer Science · 2021-10-13 · Ilya Kostrikov, Ashvin Nair, Sergey Levine

Iteratively Refined Behavior Regularization for Offline Reinforcement Learning

One of the fundamental challenges for offline reinforcement learning (RL) is ensuring robustness to data distribution. Whether the data originates from a near-optimal policy or not, we anticipate that an algorithm should demonstrate its…

Machine Learning · Computer Science · 2023-10-18 · Xiaohan Hu, Yi Ma, Chenjun Xiao, Yan Zheng, Jianye Hao

Unsupervised Data Generation for Offline Reinforcement Learning: A Perspective from Model

Offline reinforcement learning (RL) recently gains growing interests from RL researchers. However, the performance of offline RL suffers from the out-of-distribution problem, which can be corrected by feedback in online RL. Previous offline…

Machine Learning · Computer Science · 2025-06-25 · Shuncheng He, Hongchang Zhang, Jianzhun Shao, Yuhang Jiang, Xiangyang Ji

Offline Meta-Reinforcement Learning with Flow-Based Task Inference and Adaptive Correction of Feature Overgeneralization

Offline meta-reinforcement learning (OMRL) combines the strengths of learning from diverse datasets in offline RL with the adaptability to new tasks of meta-RL, promising safe and efficient knowledge acquisition by RL agents. However, OMRL…

Machine Learning · Computer Science · 2026-01-13 · Min Wang, Xin Li, Mingzhong Wang, Hasnaa Bennis

Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning

Offline reinforcement learning aims to utilize datasets of previously gathered environment-action interaction records to learn a policy without access to the real environment. Recent work has shown that offline reinforcement learning can be…

Machine Learning · Computer Science · 2023-08-30 · Hanhan Zhou, Tian Lan, Vaneet Aggarwal