Related papers: Sample Efficient Active Algorithms for Offline Rei…

Hundreds Guide Millions: Adaptive Offline Reinforcement Learning with Expert Guidance

Offline reinforcement learning (RL) optimizes the policy on a previously collected dataset without any interactions with the environment, yet usually suffers from the distributional shift problem. To mitigate this issue, a typical solution…

Machine Learning · Computer Science 2023-09-06 Qisen Yang , Shenzhi Wang , Qihang Zhang , Gao Huang , Shiji Song

Active Advantage-Aligned Online Reinforcement Learning with Offline Data

Online reinforcement learning (RL) enhances policies through direct interactions with the environment, but faces challenges related to sample efficiency. In contrast, offline RL leverages extensive pre-collected data to learn policies, but…

Machine Learning · Computer Science 2026-03-10 Xuefeng Liu , Hung T. C. Le , Siyu Chen , Rick Stevens , Zhuoran Yang , Matthew R. Walter , Yuxin Chen

Sample-Efficient Policy Constraint Offline Deep Reinforcement Learning based on Sample Filtering

Offline reinforcement learning (RL) aims to learn a policy that maximizes the expected return using a given static dataset of transitions. However, offline RL faces the distribution shift problem. The policy constraint offline RL method is…

Machine Learning · Computer Science 2025-12-24 Yuanhao Chen , Qi Liu , Pengbin Chen , Zhongjian Qiao , Yanjie Li

Offline Reinforcement Learning with Adaptive Behavior Regularization

Offline reinforcement learning (RL) defines a sample-efficient learning paradigm, where a policy is learned from static and previously collected datasets without additional interaction with the environment. The major obstacle to offline RL…

Machine Learning · Computer Science 2022-11-16 Yunfan Zhou , Xijun Li , Qingyu Qu

Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation

Offline Reinforcement Learning (RL) is a promising approach for learning optimal policies in environments where direct exploration is expensive or unfeasible. However, the adoption of such policies in practice is often challenging, as they…

Machine Learning · Computer Science 2020-11-03 Aaron Sonabend-W , Junwei Lu , Leo A. Celi , Tianxi Cai , Peter Szolovits

Instabilities of Offline RL with Pre-Trained Neural Representation

In offline reinforcement learning (RL), we seek to utilize offline data to evaluate (or learn) policies in scenarios where the data are collected from a distribution that substantially differs from that of the target policy to be evaluated.…

Machine Learning · Computer Science 2021-03-09 Ruosong Wang , Yifan Wu , Ruslan Salakhutdinov , Sham M. Kakade

Offline Reinforcement Learning with Additional Covering Distributions

We study learning optimal policies from a logged dataset, i.e., offline RL, with function approximation. Despite the efforts devoted, existing algorithms with theoretic finite-sample guarantees typically assume exploratory data coverage or…

Machine Learning · Computer Science 2023-05-25 Chenjie Mao

Evaluation-Time Policy Switching for Offline Reinforcement Learning

Offline reinforcement learning (RL) looks at learning how to optimally solve tasks using a fixed dataset of interactions from the environment. Many off-policy algorithms developed for online learning struggle in the offline setting as they…

Machine Learning · Computer Science 2025-03-18 Natinael Solomon Neggatu , Jeremie Houssineau , Giovanni Montana

Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets

Offline policy learning is aimed at learning decision-making policies using existing datasets of trajectories without collecting additional data. The primary motivation for using reinforcement learning (RL) instead of supervised learning…

Machine Learning · Computer Science 2023-10-13 Zhang-Wei Hong , Aviral Kumar , Sathwik Karnik , Abhishek Bhandwaldar , Akash Srivastava , Joni Pajarinen , Romain Laroche , Abhishek Gupta , Pulkit Agrawal

Diffusion Policies for Risk-Averse Behavior Modeling in Offline Reinforcement Learning

Offline reinforcement learning (RL) presents distinct challenges as it relies solely on observational data. A central concern in this context is ensuring the safety of the learned policy by quantifying uncertainties associated with various…

Machine Learning · Computer Science 2025-07-03 Xiaocong Chen , Siyu Wang , Tong Yu , Lina Yao

Offline Reinforcement Learning with Implicit Q-Learning

Offline reinforcement learning requires reconciling two conflicting aims: learning a policy that improves over the behavior policy that collected the dataset, while at the same time minimizing the deviation from the behavior policy so as to…

Machine Learning · Computer Science 2021-10-13 Ilya Kostrikov , Ashvin Nair , Sergey Levine

Provably Efficient Offline Goal-Conditioned Reinforcement Learning with General Function Approximation and Single-Policy Concentrability

Goal-conditioned reinforcement learning (GCRL) refers to learning general-purpose skills that aim to reach diverse goals. In particular, offline GCRL only requires purely pre-collected datasets to perform training tasks without additional…

Machine Learning · Computer Science 2023-10-13 Hanlin Zhu , Amy Zhang

ORVIT: Near-Optimal Online Distributionally Robust Reinforcement Learning

We investigate reinforcement learning (RL) in the presence of distributional mismatch between training and deployment, where policies trained in simulators often underperform in practice due to mismatches between training and deployment…

Machine Learning · Computer Science 2025-11-12 Debamita Ghosh , George K. Atia , Yue Wang

Adaptive Inverse Reinforcement Learning with Online Off-Policy Data Collection

In this paper, the inverse reinforcement learning (IRL) problem is addressed to reconstruct the unknown cost function underlying an observed optimal policy in a model-free manner, whose online adaptation with completely off-policy system…

Optimization and Control · Mathematics 2025-11-20 Yibei Li , Yuexin Cao , Zhixin Liu , Lihua Xie

Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps

We consider a challenging theoretical problem in offline reinforcement learning (RL): obtaining sample-efficiency guarantees with a dataset lacking sufficient coverage, under only realizability-type assumptions for the function…

Machine Learning · Computer Science 2022-06-16 Jinglin Chen , Nan Jiang

Gaussian Processes for Sample Efficient Reinforcement Learning with RMAX-like Exploration

We present an implementation of model-based online reinforcement learning (RL) for continuous domains with deterministic transitions that is specifically designed to achieve low sample complexity. To achieve low sample complexity, since the…

Artificial Intelligence · Computer Science 2012-02-01 Tobias Jung , Peter Stone

Improving and Benchmarking Offline Reinforcement Learning Algorithms

Recently, Offline Reinforcement Learning (RL) has achieved remarkable progress with the emergence of various algorithms and datasets. However, these methods usually focus on algorithmic advancements, ignoring that many low-level…

Machine Learning · Computer Science 2023-06-02 Bingyi Kang , Xiao Ma , Yirui Wang , Yang Yue , Shuicheng Yan

Iteratively Refined Behavior Regularization for Offline Reinforcement Learning

One of the fundamental challenges for offline reinforcement learning (RL) is ensuring robustness to data distribution. Whether the data originates from a near-optimal policy or not, we anticipate that an algorithm should demonstrate its…

Machine Learning · Computer Science 2023-10-18 Xiaohan Hu , Yi Ma , Chenjun Xiao , Yan Zheng , Jianye Hao

Efficient Action Robust Reinforcement Learning with Probabilistic Policy Execution Uncertainty

Robust reinforcement learning (RL) aims to find a policy that optimizes the worst-case performance in the face of uncertainties. In this paper, we focus on action robust RL with the probabilistic policy execution uncertainty, in which,…

Machine Learning · Computer Science 2023-07-21 Guanlin Liu , Zhihan Zhou , Han Liu , Lifeng Lai

Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning

Offline reinforcement learning (RL) extends the paradigm of classical RL algorithms to purely learning from static datasets, without interacting with the underlying environment during the learning process. A key challenge of offline RL is…

Machine Learning · Computer Science 2022-06-16 Shentao Yang , Yihao Feng , Shujian Zhang , Mingyuan Zhou