Related papers: Offline Safe Reinforcement Learning Using Trajecto…

Efficient Reinforcement Learning Through Trajectory Generation

A key barrier to using reinforcement learning (RL) in many real-world applications is the requirement of a large number of system interactions to learn a good control policy. Off-policy and Offline RL methods have been proposed to reduce…

Machine Learning · Computer Science 2022-12-02 Wenqi Cui , Linbin Huang , Weiwei Yang , Baosen Zhang

Safe Reinforcement Learning with Minimal Supervision

Reinforcement learning (RL) in the real world necessitates the development of procedures that enable agents to explore without causing harm to themselves or others. The most successful solutions to the problem of safe RL leverage offline…

Machine Learning · Computer Science 2025-01-09 Alexander Quessy , Thomas Richardson , Sebastian East

Imitate the Good and Avoid the Bad: An Incremental Approach to Safe Reinforcement Learning

A popular framework for enforcing safe actions in Reinforcement Learning (RL) is Constrained RL, where trajectory based constraints on expected cost (or other cost measures) are employed to enforce safety and more importantly these…

Machine Learning · Computer Science 2024-08-09 Huy Hoang , Tien Mai , Pradeep Varakantham

Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets

Offline policy learning is aimed at learning decision-making policies using existing datasets of trajectories without collecting additional data. The primary motivation for using reinforcement learning (RL) instead of supervised learning…

Machine Learning · Computer Science 2023-10-13 Zhang-Wei Hong , Aviral Kumar , Sathwik Karnik , Abhishek Bhandwaldar , Akash Srivastava , Joni Pajarinen , Romain Laroche , Abhishek Gupta , Pulkit Agrawal

Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation

A promising paradigm for offline reinforcement learning (RL) is to constrain the learned policy to stay close to the dataset behaviors, known as policy constraint offline RL. However, existing works heavily rely on the purity of the data,…

Machine Learning · Computer Science 2022-10-20 Chengqian Gao , Ke Xu , Liu Liu , Deheng Ye , Peilin Zhao , Zhiqiang Xu

Online Optimization for Offline Safe Reinforcement Learning

We study the problem of Offline Safe Reinforcement Learning (OSRL), where the goal is to learn a reward-maximizing policy from fixed data under a cumulative cost constraint. We propose a novel OSRL approach that frames the problem as a…

Machine Learning · Computer Science 2025-10-28 Yassine Chemingui , Aryan Deshwal , Alan Fern , Thanh Nguyen-Tang , Janardhan Rao Doppa

Offline Reinforcement Learning for Autonomous Driving with Safety and Exploration Enhancement

Reinforcement learning (RL) is a powerful data-driven control method that has been largely explored in autonomous driving tasks. However, conventional RL approaches learn control policies through trial-and-error interactions with the…

Robotics · Computer Science 2021-11-03 Tianyu Shi , Dong Chen , Kaian Chen , Zhaojian Li

Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning

In safe offline reinforcement learning (RL), the objective is to develop a policy that maximizes cumulative rewards while strictly adhering to safety constraints, utilizing only offline data. Traditional methods often face difficulties in…

Machine Learning · Computer Science 2026-02-11 Prajwal Koirala , Zhanhong Jiang , Soumik Sarkar , Cody Fleming

FOSP: Fine-tuning Offline Safe Policy through World Models

Offline Safe Reinforcement Learning (RL) seeks to address safety constraints by learning from static datasets and restricting exploration. However, these approaches heavily rely on the dataset and struggle to generalize to unseen scenarios…

Robotics · Computer Science 2025-03-04 Chenyang Cao , Yucheng Xin , Silang Wu , Longxiang He , Zichen Yan , Junbo Tan , Xueqian Wang

A Workflow for Offline Model-Free Robotic Reinforcement Learning

Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction. This can allow robots to acquire generalizable skills from large and diverse datasets, without any…

Machine Learning · Computer Science 2021-09-24 Aviral Kumar , Anikait Singh , Stephen Tian , Chelsea Finn , Sergey Levine

Offline Reinforcement Learning as Anti-Exploration

Offline Reinforcement Learning (RL) aims at learning an optimal control from a fixed dataset, without interactions with the system. An agent in this setting should avoid selecting actions whose consequences cannot be predicted from the…

Machine Learning · Computer Science 2021-06-14 Shideh Rezaeifar , Robert Dadashi , Nino Vieillard , Léonard Hussenot , Olivier Bachem , Olivier Pietquin , Matthieu Geist

Constrained Decision Transformer for Offline Safe Reinforcement Learning

Safe reinforcement learning (RL) trains a constraint satisfaction policy by interacting with the environment. We aim to tackle a more challenging problem: learning a safe policy from an offline dataset. We study the offline safe RL problem…

Machine Learning · Computer Science 2023-06-22 Zuxin Liu , Zijian Guo , Yihang Yao , Zhepeng Cen , Wenhao Yu , Tingnan Zhang , Ding Zhao

A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems

With the widespread adoption of deep learning, reinforcement learning (RL) has experienced a dramatic increase in popularity, scaling to previously intractable problems, such as playing complex games from pixel observations, sustaining…

Machine Learning · Computer Science 2023-04-20 Rafael Figueiredo Prudencio , Marcos R. O. A. Maximo , Esther Luna Colombini

Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones

Safety remains a central obstacle preventing widespread use of RL in the real world: learning new tasks in uncertain environments requires extensive exploration, but safety requires limiting exploration. We propose Recovery RL, an algorithm…

Machine Learning · Computer Science 2021-05-19 Brijen Thananjeyan , Ashwin Balakrishna , Suraj Nair , Michael Luo , Krishnan Srinivasan , Minho Hwang , Joseph E. Gonzalez , Julian Ibarz , Chelsea Finn , Ken Goldberg

Constraints Penalized Q-learning for Safe Offline Reinforcement Learning

We study the problem of safe offline reinforcement learning (RL), the goal is to learn a policy that maximizes long-term reward while satisfying safety constraints given only offline data, without further interaction with the environment.…

Machine Learning · Computer Science 2022-04-11 Haoran Xu , Xianyuan Zhan , Xiangyu Zhu

Offline Safe Policy Optimization From Heterogeneous Feedback

Offline Preference-based Reinforcement Learning (PbRL) learns rewards and policies aligned with human preferences without the need for extensive reward engineering and direct interaction with human annotators. However, ensuring safety…

Artificial Intelligence · Computer Science 2025-12-24 Ze Gong , Pradeep Varakantham , Akshat Kumar

Offline Reinforcement Learning Hands-On

Offline Reinforcement Learning (RL) aims to turn large datasets into powerful decision-making engines without any online interactions with the environment. This great promise has motivated a large amount of research that hopes to replicate…

Machine Learning · Computer Science 2020-12-01 Louis Monier , Jakub Kmec , Alexandre Laterre , Thomas Pierrot , Valentin Courgeau , Olivier Sigaud , Karim Beguir

Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation

Offline Reinforcement Learning (RL) is a promising approach for learning optimal policies in environments where direct exploration is expensive or unfeasible. However, the adoption of such policies in practice is often challenging, as they…

Machine Learning · Computer Science 2020-11-03 Aaron Sonabend-W , Junwei Lu , Leo A. Celi , Tianxi Cai , Peter Szolovits

One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning

Offline reinforcement learning (RL) is suitable for safety-critical domains where online exploration is too costly or dangerous. In such safety-critical settings, decision-making should take into consideration the risk of catastrophic…

Machine Learning · Computer Science 2023-10-31 Marc Rigter , Bruno Lacerda , Nick Hawes

Safe Reinforcement Learning with Dead-Ends Avoidance and Recovery

Safety is one of the main challenges in applying reinforcement learning to realistic environmental tasks. To ensure safety during and after training process, existing methods tend to adopt overly conservative policy to avoid unsafe…

Machine Learning · Computer Science 2023-06-27 Xiao Zhang , Hai Zhang , Hongtu Zhou , Chang Huang , Di Zhang , Chen Ye , Junqiao Zhao