Related papers: Model-based Safe Deep Reinforcement Learning via a…

Safe Continuous Control with Constrained Model-Based Policy Optimization

The applicability of reinforcement learning (RL) algorithms in real-world domains often requires adherence to safety constraints, a need difficult to address given the asymptotic nature of the classic RL optimization objective. In contrast…

Machine Learning · Computer Science 2021-04-15 Moritz A. Zanger, Karam Daaboul, J. Marius Zöllner

Constrained Markov Decision Processes via Backward Value Functions

Although Reinforcement Learning (RL) algorithms have found tremendous success in simulated domains, they often cannot directly be applied to physical systems, especially in cases where there are hard constraints to satisfy (e.g. on safety…

Machine Learning · Computer Science 2020-08-28 Harsh Satija, Philip Amortila, Joelle Pineau

Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs

Many physical systems have underlying safety considerations that require that the policy employed ensures the satisfaction of a set of constraints. The analytical formulation usually takes the form of a Constrained Markov Decision Process…

Machine Learning · Computer Science 2021-03-03 Aria HasanzadeZonuzy, Archana Bura, Dileep Kalathil, Srinivas Shakkottai

Robust Constrained-MDPs: Soft-Constrained Robust Policy Optimization under Model Uncertainty

In this paper, we focus on the problem of robustifying reinforcement learning (RL) algorithms with respect to model uncertainties. Indeed, in the framework of model-based RL, we propose to merge the theory of constrained Markov decision…

Machine Learning · Computer Science 2020-10-13 Reazul Hasan Russel, Mouhacine Benosman, Jeroen Van Baar

Safe Policies for Reinforcement Learning via Primal-Dual Methods

In this paper, we study the learning of safe policies in the setting of reinforcement learning problems. That is, we aim to control a Markov Decision Process (MDP) of which we do not know the transition probabilities, but we have access to…

Systems and Control · Electrical Eng. & Systems 2022-01-14 Santiago Paternain, Miguel Calvo-Fullana, Luiz F. O. Chamon, Alejandro Ribeiro

Offline Safe Policy Optimization From Heterogeneous Feedback

Offline Preference-based Reinforcement Learning (PbRL) learns rewards and policies aligned with human preferences without the need for extensive reward engineering and direct interaction with human annotators. However, ensuring safety…

Artificial Intelligence · Computer Science 2025-12-24 Ze Gong, Pradeep Varakantham, Akshat Kumar

A Safe Reinforcement Learning Algorithm for Supervisory Control of Power Plants

Traditional control theory-based methods require tailored engineering for each system and constant fine-tuning. In power plant control, one often needs to obtain a precise representation of the system dynamics and carefully design the…

Systems and Control · Electrical Eng. & Systems 2024-09-21 Yixuan Sun, Sami Khairy, Richard B. Vilim, Rui Hu, Akshay J. Dave

Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning

In safe offline reinforcement learning (RL), the objective is to develop a policy that maximizes cumulative rewards while strictly adhering to safety constraints, utilizing only offline data. Traditional methods often face difficulties in…

Machine Learning · Computer Science 2026-02-11 Prajwal Koirala, Zhanhong Jiang, Soumik Sarkar, Cody Fleming

Safe Reinforcement Learning for Constrained Markov Decision Processes with Stochastic Stopping Time

In this paper, we present an online reinforcement learning algorithm for constrained Markov decision processes with a safety constraint. Despite the necessary attention of the scientific community, considering stochastic stopping time, the…

Machine Learning · Computer Science 2024-03-26 Abhijit Mazumdar, Rafal Wisniewski, Manuela L. Bujorianu

Beyond Hard Constraints: Budget-Conditioned Reachability For Safe Offline Reinforcement Learning

Sequential decision making using Markov Decision Processes underpins many real-world applications. Both model-based and model-free methods have achieved strong results in these settings. However, real-world tasks must balance reward…

Machine Learning · Computer Science 2026-04-01 Janaka Chathuranga Brahmanage, Akshat Kumar

Lyapunov-based Safe Policy Optimization for Continuous Control

We study continuous action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through safe policies, i.e., policies that do not take the agent to undesirable situations. We formulate…

Machine Learning · Computer Science 2019-02-13 Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duenez-Guzman, Mohammad Ghavamzadeh

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments

It is quite challenging to ensure the safety of reinforcement learning (RL) agents in an unknown and stochastic environment under hard constraints that require the system state not to reach certain specified unsafe regions. Many popular…

Systems and Control · Electrical Eng. & Systems 2023-06-14 Yixuan Wang, Simon Sinong Zhan, Ruochen Jiao, Zhilu Wang, Wanxin Jin, Zhuoran Yang, Zhaoran Wang, Chao Huang, Qi Zhu

Constrained Reinforcement Learning with Average Reward Objective: Model-Based and Model-Free Algorithms

Reinforcement Learning (RL) serves as a versatile framework for sequential decision-making, finding applications across diverse domains such as robotics, autonomous driving, recommendation systems, supply chain optimization, biology,…

Machine Learning · Computer Science 2024-08-26 Vaneet Aggarwal, Washim Uddin Mondal, Qinbo Bai

Safe Reinforcement Learning Using Advantage-Based Intervention

Many sequential decision problems involve finding a policy that maximizes total reward while obeying safety constraints. Although much recent research has focused on the development of safe reinforcement learning (RL) algorithms that…

Machine Learning · Computer Science 2021-07-20 Nolan Wagener, Byron Boots, Ching-An Cheng

Safety-Constrained Policy Transfer with Successor Features

In this work, we focus on the problem of safe policy transfer in reinforcement learning: we seek to leverage existing policies when learning a new task with specified constraints. This problem is important for safety-critical applications…

Machine Learning · Computer Science 2022-11-11 Zeyu Feng, Bowen Zhang, Jianxin Bi, Harold Soh

Evaluation of Constrained Reinforcement Learning Algorithms for Legged Locomotion

Shifting from traditional control strategies to Deep Reinforcement Learning (RL) for legged robots poses inherent challenges, especially when addressing real-world physical constraints during training. While high-fidelity simulations…

Robotics · Computer Science 2023-09-28 Joonho Lee, Lukas Schroth, Victor Klemm, Marko Bjelonic, Alexander Reske, Marco Hutter

Trajectory-Oriented Policy Optimization with Sparse Rewards

Mastering deep reinforcement learning (DRL) proves challenging in tasks featuring scant rewards. These limited rewards merely signify whether the task is partially or entirely accomplished, necessitating various exploration actions before…

Machine Learning · Computer Science 2024-04-11 Guojian Wang, Faguo Wu, Xiao Zhang

Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees

Constrained decision-making is essential for designing safe policies in real-world control systems, yet simulated environments often fail to capture real-world adversities. We consider the problem of learning a policy that will maximize the…

Machine Learning · Computer Science 2026-02-10 Sourav Ganguly, Kishan Panaganti, Arnob Ghosh, Adam Wierman

A Policy Efficient Reduction Approach to Convex Constrained Deep Reinforcement Learning

Although well-established in general reinforcement learning (RL), value-based methods are rarely explored in constrained RL (CRL) for their incapability of finding policies that can randomize among multiple actions. To apply value-based…

Machine Learning · Computer Science 2022-06-28 Tianchi Cai, Wenpeng Zhang, Lihong Gu, Xiaodong Zeng, Jinjie Gu

Constrained Variational Policy Optimization for Safe Reinforcement Learning

Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before deploying them to safety-critical applications. Previous primal-dual style approaches suffer from instability issues and lack optimality…

Machine Learning · Computer Science 2022-06-20 Zuxin Liu, Zhepeng Cen, Vladislav Isenbaev, Wei Liu, Zhiwei Steven Wu, Bo Li, Ding Zhao