Related papers: Robust Reinforcement Learning using Offline Data

Robust Regularized Policy Iteration under Transition Uncertainty

Offline reinforcement learning (RL) enables data-efficient and safe policy learning without online exploration, but its performance often degrades under distribution shift. The learned policy may visit out-of-distribution state-action pairs…

Artificial Intelligence · Computer Science 2026-03-17 Hongqiang Lin, Zhenghui Fu, Weihao Tang, Pengfei Wang, Yiding Sun, Qixian Huang, Dongxu Zhang

Scalable Multi-Agent Offline Reinforcement Learning and the Role of Information

Offline Reinforcement Learning (RL) focuses on learning policies solely from a batch of previously collected data, offering the potential to leverage such datasets effectively without the need for costly or risky active exploration. While…

Machine Learning · Computer Science 2025-06-06 Riccardo Zamboni, Enrico Brunetti, Marcello Restelli

Residuals-based Offline Reinforcement Learning

Offline reinforcement learning (RL) has received increasing attention for learning policies from previously collected data without interaction with the real environment, which is particularly important in high-stakes applications. While a…

Machine Learning · Computer Science 2026-04-03 Qing Zhu, Xian Yu

Robust Offline Reinforcement Learning for Non-Markovian Decision Processes

Distributionally robust offline reinforcement learning (RL) aims to find a policy that performs the best under the worst environment within an uncertainty set using an offline dataset collected from a nominal model. While recent advances in…

Machine Learning · Computer Science 2025-01-07 Ruiquan Huang, Yingbin Liang, Jing Yang

Minimax Optimal and Computationally Efficient Algorithms for Distributionally Robust Offline Reinforcement Learning

Distributionally robust offline reinforcement learning (RL), which seeks robust policy training against environment perturbation by modeling dynamics uncertainty, calls for function approximations when facing large state-action spaces.…

Machine Learning · Computer Science 2025-11-03 Zhishuai Liu, Pan Xu

Learning Robust Options

Robust reinforcement learning aims to produce policies that have strong guarantees even in the face of environments/transition models whose parameters have strong uncertainty. Existing work uses value-based methods and the usual primitive…

Artificial Intelligence · Computer Science 2018-02-12 Daniel J. Mankowitz, Timothy A. Mann, Pierre-Luc Bacon, Doina Precup, Shie Mannor

Online Robust Reinforcement Learning with General Function Approximation

In many real-world settings, reinforcement learning systems suffer performance degradation when the environment encountered at deployment differs from that observed during training. Distributionally robust reinforcement learning (DR-RL)…

Machine Learning · Computer Science 2026-03-05 Debamita Ghosh, George K. Atia, Yue Wang

Towards Robust Offline Reinforcement Learning under Diverse Data Corruption

Offline reinforcement learning (RL) presents a promising approach for learning reinforced policies from offline datasets without the need for costly or unsafe interactions with the environment. However, datasets collected by humans in…

Machine Learning · Computer Science 2024-03-12 Rui Yang, Han Zhong, Jiawei Xu, Amy Zhang, Chongjie Zhang, Lei Han, Tong Zhang

Offline Reinforcement Learning for Autonomous Driving with Safety and Exploration Enhancement

Reinforcement learning (RL) is a powerful data-driven control method that has been largely explored in autonomous driving tasks. However, conventional RL approaches learn control policies through trial-and-error interactions with the…

Robotics · Computer Science 2021-11-03 Tianyu Shi, Dong Chen, Kaian Chen, Zhaojian Li

ORVIT: Near-Optimal Online Distributionally Robust Reinforcement Learning

We investigate reinforcement learning (RL) in the presence of distributional mismatch between training and deployment, where policies trained in simulators often underperform in practice due to mismatches between training and deployment…

Machine Learning · Computer Science 2025-11-12 Debamita Ghosh, George K. Atia, Yue Wang

Evaluation-Time Policy Switching for Offline Reinforcement Learning

Offline reinforcement learning (RL) looks at learning how to optimally solve tasks using a fixed dataset of interactions from the environment. Many off-policy algorithms developed for online learning struggle in the offline setting as they…

Machine Learning · Computer Science 2025-03-18 Natinael Solomon Neggatu, Jeremie Houssineau, Giovanni Montana

Online Robust Reinforcement Learning with Model Uncertainty

Robust reinforcement learning (RL) is to find a policy that optimizes the worst-case performance over an uncertainty set of MDPs. In this paper, we focus on model-free robust RL, where the uncertainty set is defined to be centering at a…

Machine Learning · Computer Science 2021-10-29 Yue Wang, Shaofeng Zou

Offline Reinforcement Learning for Wireless Network Optimization with Mixture Datasets

The recent development of reinforcement learning (RL) has boosted the adoption of online RL for wireless radio resource management (RRM). However, online RL algorithms require direct interactions with the environment, which may be…

Information Theory · Computer Science 2023-11-21 Kun Yang, Cong Shen, Jing Yang, Shu-ping Yeh, Jerry Sydir

Robust Offline Reinforcement Learning -- Certify the Confidence Interval

Currently, reinforcement learning (RL), especially deep RL, has received more and more attention in the research area. However, the security of RL has been an obvious problem due to the attack manners becoming mature. In order to defend…

Machine Learning · Computer Science 2023-10-04 Jiarui Yao, Simon Shaolei Du

Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions

Offline reinforcement learning (RL) allows for the training of competent agents from offline datasets without any interaction with the environment. Online finetuning of such offline models can further improve performance. But how should we…

Machine Learning · Computer Science 2023-03-31 Yicheng Luo, Jackie Kay, Edward Grefenstette, Marc Peter Deisenroth

Distributionally Robust Model-Based Offline Reinforcement Learning with Near-Optimal Sample Complexity

This paper concerns the central issues of model robustness and sample efficiency in offline reinforcement learning (RL), which aims to learn to perform decision making from history data without active exploration. Due to uncertainties and…

Machine Learning · Computer Science 2024-01-01 Laixi Shi, Yuejie Chi

MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator

Offline reinforcement learning (RL) faces a significant challenge of distribution shift. Model-free offline RL penalizes the Q value for out-of-distribution (OOD) data or constrains the policy to stay close to the behavior policy to tackle this…

Machine Learning · Computer Science 2024-04-18 Xiao-Yin Liu, Xiao-Hu Zhou, Guotao Li, Hao Li, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Zeng-Guang Hou

Towards Robust Offline-to-Online Reinforcement Learning via Uncertainty and Smoothness

To obtain a near-optimal policy with fewer interactions in Reinforcement Learning (RL), a promising approach involves the combination of offline RL, which enhances sample efficiency by leveraging offline datasets, and online RL, which…

Machine Learning · Computer Science 2024-11-18 Xiaoyu Wen, Xudong Yu, Rui Yang, Haoyuan Chen, Chenjia Bai, Zhen Wang

Conservative Q-Learning for Offline Reinforcement Learning

Effectively leveraging large, previously collected datasets in reinforcement learning (RL) is a key challenge for large-scale real-world applications. Offline RL algorithms promise to learn effective policies from previously-collected,…

Machine Learning · Computer Science 2020-08-20 Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine

Constraints Penalized Q-learning for Safe Offline Reinforcement Learning

We study the problem of safe offline reinforcement learning (RL), where the goal is to learn a policy that maximizes long-term reward while satisfying safety constraints given only offline data, without further interaction with the environment.…

Machine Learning · Computer Science 2022-04-11 Haoran Xu, Xianyuan Zhan, Xiangyu Zhu