Related papers: Safe Policy Optimization with Local Generalized Li…

Safe Exploration via Policy Priors

Safe exploration is a key requirement for reinforcement learning (RL) agents to learn and adapt online, beyond controlled (e.g. simulated) environments. In this work, we tackle this challenge by utilizing suboptimal yet conservative…

Machine Learning · Computer Science 2026-02-10 Manuel Wendl, Yarden As, Manish Prajapat, Anton Pollak, Stelian Coros, Andreas Krause

Safety Optimized Reinforcement Learning via Multi-Objective Policy Optimization

Safe reinforcement learning (Safe RL) refers to a class of techniques that aim to prevent RL algorithms from violating constraints in the process of decision-making and exploration during trial and error. In this paper, a novel model-free…

Systems and Control · Electrical Eng. & Systems 2024-08-14 Homayoun Honari, Mehran Ghafarian Tamizi, Homayoun Najjaran

Iterative Reachability Estimation for Safe Reinforcement Learning

Ensuring safety is important for the practical deployment of reinforcement learning (RL). Various challenges must be addressed, such as handling stochasticity in the environments, providing rigorous guarantees of persistent state-wise…

Machine Learning · Computer Science 2023-09-26 Milan Ganai, Zheng Gong, Chenning Yu, Sylvia Herbert, Sicun Gao

A Provable Approach for End-to-End Safe Reinforcement Learning

A longstanding goal in safe reinforcement learning (RL) is a method to ensure the safety of a policy throughout the entire process, from learning to operation. However, existing safe RL paradigms inherently struggle to achieve this…

Machine Learning · Computer Science 2025-05-29 Akifumi Wachi, Kohei Miyaguchi, Takumi Tanabe, Rei Sato, Youhei Akimoto

Safe Continuous Control with Constrained Model-Based Policy Optimization

The applicability of reinforcement learning (RL) algorithms in real-world domains often requires adherence to safety constraints, a need difficult to address given the asymptotic nature of the classic RL optimization objective. In contrast…

Machine Learning · Computer Science 2021-04-15 Moritz A. Zanger, Karam Daaboul, J. Marius Zöllner

Optimistic Linear Support and Successor Features as a Basis for Optimal Policy Transfer

In many real-world applications, reinforcement learning (RL) agents might have to solve multiple tasks, each one typically modeled via a reward function. If reward functions are expressed linearly, and the agent has previously learned a set…

Machine Learning · Computer Science 2022-06-24 Lucas N. Alegre, Ana L. C. Bazzan, Bruno C. da Silva

Safety Representations for Safer Policy Learning

Reinforcement learning algorithms typically necessitate extensive exploration of the state space to find optimal policies. However, in safety-critical applications, the risks associated with such exploration can lead to catastrophic…

Machine Learning · Computer Science 2025-02-28 Kaustubh Mani, Vincent Mai, Charlie Gauthier, Annie Chen, Samer Nashed, Liam Paull

Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation

Safe reinforcement learning (RL) is crucial for deploying RL agents in real-world applications, as it aims to maximize long-term rewards while satisfying safety constraints. However, safe RL often suffers from sample inefficiency, requiring…

Machine Learning · Computer Science 2024-06-03 Shangding Gu, Laixi Shi, Yuhao Ding, Alois Knoll, Costas Spanos, Adam Wierman, Ming Jin

FOSP: Fine-tuning Offline Safe Policy through World Models

Offline Safe Reinforcement Learning (RL) seeks to address safety constraints by learning from static datasets and restricting exploration. However, these approaches heavily rely on the dataset and struggle to generalize to unseen scenarios…

Robotics · Computer Science 2025-03-04 Chenyang Cao, Yucheng Xin, Silang Wu, Longxiang He, Zichen Yan, Junbo Tan, Xueqian Wang

Exchange Policy Optimization Algorithm for Semi-Infinite Safe Reinforcement Learning

Safe reinforcement learning (safe RL) aims to respect safety requirements while optimizing long-term performance. In many practical applications, however, the problem involves an infinite number of constraints, known as semi-infinite safe…

Machine Learning · Computer Science 2025-11-07 Jiaming Zhang, Yujie Yang, Haoning Wang, Liping Zhang, Shengbo Eben Li

Safe Reinforcement Learning with Dead-Ends Avoidance and Recovery

Safety is one of the main challenges in applying reinforcement learning to realistic environmental tasks. To ensure safety during and after the training process, existing methods tend to adopt an overly conservative policy to avoid unsafe…

Machine Learning · Computer Science 2023-06-27 Xiao Zhang, Hai Zhang, Hongtu Zhou, Chang Huang, Di Zhang, Chen Ye, Junqiao Zhao

Sampling-Based Safe Reinforcement Learning

Safe exploration remains a fundamental challenge in reinforcement learning (RL), limiting the deployment of RL agents in the real world. We propose Sampling-Based Safe Reinforcement Learning (SBSRL), a model-based RL algorithm that…

Machine Learning · Computer Science 2026-05-20 Luca Vignola, Bruce D. Lee, Manish Prajapat, Manuel Wendl, Melanie Zeilinger, Andreas Krause, Yarden As

Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis

This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL), such that the safety constraint violations are bounded at any point during learning. In a variety of RL applications the safety of the…

Machine Learning · Computer Science 2023-12-19 Rohan Mitta, Hosein Hasanbeig, Jun Wang, Daniel Kroening, Yiannis Kantaros, Alessandro Abate

Improving Safety in Deep Reinforcement Learning using Unsupervised Action Planning

One of the key challenges in deep reinforcement learning (deep RL) is to ensure safety in both the training and testing phases. In this work, we propose a novel technique of unsupervised action planning to improve the safety of on-policy…

Robotics · Computer Science 2021-09-30 Hao-Lun Hsu, Qiuhua Huang, Sehoon Ha

Cautious Reinforcement Learning with Logical Constraints

This paper presents the concept of an adaptive safe padding that forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process. Policies are synthesised to satisfy a goal,…

Machine Learning · Computer Science 2020-03-24 Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening

Safe Exploration of State and Action Spaces in Reinforcement Learning

In this paper, we consider the important problem of safe exploration in reinforcement learning. While reinforcement learning is well-suited to domains with complex transition dynamics and high-dimensional state-action spaces, an additional…

Machine Learning · Computer Science 2014-02-05 Javier Garcia, Fernando Fernandez

SPO: Sequential Monte Carlo Policy Optimisation

Leveraging planning during learning and decision-making is central to the long-term development of intelligent agents. Recent works have successfully combined tree-based search methods and self-play learning mechanisms to this end. However,…

Artificial Intelligence · Computer Science 2024-11-01 Matthew V Macfarlane, Edan Toledo, Donal Byrne, Paul Duckworth, Alexandre Laterre

Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones

Safety remains a central obstacle preventing widespread use of RL in the real world: learning new tasks in uncertain environments requires extensive exploration, but safety requires limiting exploration. We propose Recovery RL, an algorithm…

Machine Learning · Computer Science 2021-05-19 Brijen Thananjeyan, Ashwin Balakrishna, Suraj Nair, Michael Luo, Krishnan Srinivasan, Minho Hwang, Joseph E. Gonzalez, Julian Ibarz, Chelsea Finn, Ken Goldberg

Safe-To-Explore State Spaces: Ensuring Safe Exploration in Policy Search with Hierarchical Task Optimization

Policy search reinforcement learning allows robots to acquire skills by themselves. However, the learning procedure is inherently unsafe as the robot has no a-priori way to predict the consequences of the exploratory actions it takes.…

Robotics · Computer Science 2018-10-09 Jens Lundell, Robert Krug, Erik Schaffernicht, Todor Stoyanov, Ville Kyrki

Safe Exploration in Continuous Action Spaces

We address the problem of deploying a reinforcement learning (RL) agent on a physical system such as a datacenter cooling unit or robot, where critical constraints must never be violated. We show how to exploit the typically smooth dynamics…

Artificial Intelligence · Computer Science 2018-01-29 Gal Dalal, Krishnamurthy Dvijotham, Matej Vecerik, Todd Hester, Cosmin Paduraru, Yuval Tassa