Related papers: Solving Minimum-Cost Reach Avoid using Reinforceme…

Stochastic Minimum-Cost Reach-Avoid Reinforcement Learning

We study stochastic minimum-cost reach-avoid reinforcement learning, where an agent must satisfy a reach-avoid specification with probability at least $p$ while minimizing expected cumulative costs in stochastic environments. Existing safe…

Machine Learning · Computer Science 2026-05-19 Jingduo Pan, Taoran Wu, Yiling Xue, Bai Xue

Iterative Reachability Estimation for Safe Reinforcement Learning

Ensuring safety is important for the practical deployment of reinforcement learning (RL). Various challenges must be addressed, such as handling stochasticity in the environments, providing rigorous guarantees of persistent state-wise…

Machine Learning · Computer Science 2023-09-26 Milan Ganai, Zheng Gong, Chenning Yu, Sylvia Herbert, Sicun Gao

Reward Constrained Policy Optimization

Solving tasks in Reinforcement Learning is no easy feat. As the goal of the agent is to maximize the accumulated reward, it often learns to exploit loopholes and misspecifications in the reward signal resulting in unwanted behavior. While…

Machine Learning · Computer Science 2018-12-27 Chen Tessler, Daniel J. Mankowitz, Shie Mannor

Constrained Reinforcement Learning Under Model Mismatch

Existing studies on constrained reinforcement learning (RL) may obtain a well-performing policy in the training environment. However, when deployed in a real environment, it may easily violate constraints that were originally satisfied…

Machine Learning · Computer Science 2024-05-06 Zhongchang Sun, Sihong He, Fei Miao, Shaofeng Zou

Adversarial Constrained Policy Optimization: Improving Constrained Reinforcement Learning by Adapting Budgets

Constrained reinforcement learning has achieved promising progress in safety-critical fields where both rewards and constraints are considered. However, constrained reinforcement learning methods face challenges in striking the right…

Machine Learning · Computer Science 2024-10-29 Jianmina Ma, Jingtian Ji, Yue Gao

A Minimum Discounted Reward Hamilton-Jacobi Formulation for Computing Reachable Sets

We propose a novel formulation for approximating reachable sets through a minimum discounted reward optimal control problem. The formulation yields a continuous solution that can be obtained by solving a Hamilton-Jacobi equation.…

Optimization and Control · Mathematics 2018-09-05 Anayo K. Akametalu, Shromona Ghosh, Jaime F. Fisac, Claire J. Tomlin

IPPO: Obstacle Avoidance for Robotic Manipulators in Joint Space via Improved Proximal Policy Optimization

Reaching tasks with random targets and obstacles is a challenging task for robotic manipulators. In this study, we propose a novel model-free reinforcement learning approach based on proximal policy optimization (PPO) for training a deep…

Robotics · Computer Science 2023-02-10 Yongliang Wang, Hamidreza Kasaei

Dual-Objective Reinforcement Learning with Novel Hamilton-Jacobi-Bellman Formulations

Hard constraints in reinforcement learning (RL) often degrade policy performance. Lagrangian methods offer a way to blend objectives with constraints, but require intricate reward engineering and parameter tuning. In this work, we extend…

Artificial Intelligence · Computer Science 2025-12-05 William Sharpless, Dylan Hirsch, Sander Tonkens, Nikhil Shinde, Sylvia Herbert

Safety and Liveness Guarantees through Reach-Avoid Reinforcement Learning

Reach-avoid optimal control problems, in which the system must reach certain goal conditions while staying clear of unacceptable failure modes, are central to safety and liveness assurance for autonomous robotic systems, but their exact…

Machine Learning · Computer Science 2022-01-25 Kai-Chieh Hsu, Vicenç Rubies-Royo, Claire J. Tomlin, Jaime F. Fisac

A reinforcement learning approach to hybrid control design

In this paper we design hybrid control policies for hybrid systems whose mathematical models are unknown. Our contributions are threefold. First, we propose a framework for modelling the hybrid control design problem as a single Markov…

Systems and Control · Electrical Eng. & Systems 2020-09-03 Meet Gandhi, Atreyee Kundu, Shalabh Bhatnagar

Neural Lyapunov and Optimal Control

Despite impressive results, reinforcement learning (RL) suffers from slow convergence and requires a large variety of tuning strategies. In this paper, we investigate the ability of RL algorithms on simple continuous control tasks. We show…

Robotics · Computer Science 2024-02-16 Daniel Layeghi, Steve Tonneau, Michael Mistry

Hamilton-Jacobi Reachability in Reinforcement Learning: A Survey

Recent literature has proposed approaches that learn control policies with high performance while maintaining safety guarantees. Synthesizing Hamilton-Jacobi (HJ) reachable sets has become an effective tool for verifying safety and…

Systems and Control · Electrical Eng. & Systems 2024-08-23 Milan Ganai, Sicun Gao, Sylvia Herbert

Penalized Proximal Policy Optimization for Safe Reinforcement Learning

Safe reinforcement learning aims to learn the optimal policy while satisfying safety constraints, which is essential in real-world applications. However, current algorithms still struggle for efficient policy updates with hard constraint…

Machine Learning · Computer Science 2022-06-20 Linrui Zhang, Li Shen, Long Yang, Shixiang Chen, Bo Yuan, Xueqian Wang, Dacheng Tao

Learning Robust Policies via Interpretable Hamilton-Jacobi Reachability-Guided Disturbances

Deep Reinforcement Learning (RL) has shown remarkable success in robotics with complex and heterogeneous dynamics. However, its vulnerability to unknown disturbances and adversarial attacks remains a significant challenge. In this paper, we…

Robotics · Computer Science 2024-10-01 Hanyang Hu, Xilun Zhang, Xubo Lyu, Mo Chen

Beyond Hard Constraints: Budget-Conditioned Reachability For Safe Offline Reinforcement Learning

Sequential decision making using Markov Decision Process underpins many real-world applications. Both model-based and model-free methods have achieved strong results in these settings. However, real-world tasks must balance reward…

Machine Learning · Computer Science 2026-04-01 Janaka Chathuranga Brahmanage, Akshat Kumar

MC-CPO: Mastery-Conditioned Constrained Policy Optimization

Engagement-optimized adaptive tutoring systems may prioritize short-term behavioral signals over sustained learning outcomes, creating structural incentives for reward hacking in reinforcement learning policies. We formalize this challenge…

Artificial Intelligence · Computer Science 2026-04-07 Oluseyi Olukola, Nick Rahimi

Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning

We propose a successive convex approximation based off-policy optimization (SCAOPO) algorithm to solve the general constrained reinforcement learning problem, which is formulated as a constrained Markov decision process (CMDP) in the…

Machine Learning · Computer Science 2022-04-20 Chang Tian, An Liu, Guang Huang, Wu Luo

Sub-policy Adaptation for Hierarchical Reinforcement Learning

Hierarchical reinforcement learning is a promising approach to tackle long-horizon decision-making problems with sparse rewards. Unfortunately, most methods still decouple the lower-level skill acquisition process and the training of a…

Machine Learning · Computer Science 2020-05-15 Alexander C. Li, Carlos Florensa, Ignasi Clavera, Pieter Abbeel

Distributional constrained reinforcement learning for supply chain optimization

This work studies reinforcement learning (RL) in the context of multi-period supply chains subject to constraints, e.g., on production and inventory. We introduce Distributional Constrained Policy Optimization (DCPO), a novel approach for…

Machine Learning · Computer Science 2023-02-06 Jaime Sabal Bermúdez, Antonio del Rio Chanona, Calvin Tsay

Solving Richly Constrained Reinforcement Learning through State Augmentation and Reward Penalties

Constrained Reinforcement Learning has been employed to enforce safety constraints on policy through the use of expected cost constraints. The key challenge is in handling expected cost accumulated using the policy and not just in a single…

Machine Learning · Computer Science 2024-01-17 Hao Jiang, Tien Mai, Pradeep Varakantham, Minh Huy Hoang