Related papers: Optimizing Percentile Criterion Using Robust MDPs

Beyond Confidence Regions: Tight Bayesian Ambiguity Sets for Robust MDPs

Robust MDPs (RMDPs) can be used to compute policies with provable worst-case guarantees in reinforcement learning. The quality and robustness of an RMDP solution are determined by the ambiguity set---the set of plausible transition…

Machine Learning · Computer Science 2019-02-21 Marek Petrik, Reazul Hasan Russell

Optimizing Norm-Bounded Weighted Ambiguity Sets for Robust MDPs

Optimal policies in Markov decision processes (MDPs) are very sensitive to model misspecification. This raises serious concerns about deploying them in high-stake domains. Robust MDPs (RMDP) provide a promising framework to mitigate…

Machine Learning · Computer Science 2019-12-06 Reazul Hasan Russel, Bahram Behzadian, Marek Petrik

Percentile Criterion Optimization in Offline Reinforcement Learning

In reinforcement learning, robust policies for high-stakes decision-making problems with limited data are usually computed by optimizing the percentile criterion. The percentile criterion is approximately solved by constructing an…

Machine Learning · Computer Science 2024-04-09 Elita A. Lobo, Cyrus Cousins, Yair Zick, Marek Petrik

Tight Bayesian Ambiguity Sets for Robust MDPs

Robustness is important for sequential decision making in a stochastic dynamic environment with uncertain probabilistic parameters. We address the problem of using robust MDPs (RMDPs) to compute policies with provable worst-case guarantees…

Machine Learning · Computer Science 2018-11-16 Reazul Hasan Russel, Marek Petrik

Robust Constrained-MDPs: Soft-Constrained Robust Policy Optimization under Model Uncertainty

In this paper, we focus on the problem of robustifying reinforcement learning (RL) algorithms with respect to model uncertainties. Indeed, in the framework of model-based RL, we propose to merge the theory of constrained Markov decision…

Machine Learning · Computer Science 2020-10-13 Reazul Hasan Russel, Mouhacine Benosman, Jeroen Van Baar

Sample Complexity of Robust Reinforcement Learning with a Generative Model

The Robust Markov Decision Process (RMDP) framework focuses on designing control policies that are robust against the parameter uncertainties due to the mismatches between the simulator model and real-world settings. An RMDP problem is…

Machine Learning · Computer Science 2022-05-17 Kishan Panaganti, Dileep Kalathil

On the Complexity of Robust Markov Decision Processes and Bisimulation Metrics

Robust Markov decision processes (RMDPs) extend standard Markov decision processes (MDPs) to account for uncertainty in the transition probabilities. RMDPs have an uncertainty set that defines a set of possible transition functions, each of…

Logic in Computer Science · Computer Science 2026-04-30 Marnix Suilen, Guillermo A. Pérez

Robust Policy Optimization with Baseline Guarantees

Our goal is to compute a policy that guarantees improved return over a baseline policy even when the available MDP model is inaccurate. The inaccurate model may be constructed, for example, by system identification techniques when the true…

Optimization and Control · Mathematics 2015-06-17 Yinlam Chow, Marek Petrik, Mohammad Ghavamzadeh

Lyapunov Robust Constrained-MDPs: Soft-Constrained Robustly Stable Policy Optimization under Model Uncertainty

Safety and robustness are two desired properties for any reinforcement learning algorithm. CMDPs can handle additional safety constraints and RMDPs can perform well under model uncertainties. In this paper, we propose to unite these two…

Machine Learning · Computer Science 2021-08-21 Reazul Hasan Russel, Mouhacine Benosman, Jeroen Van Baar, Radu Corcodel

On the Complexity of Discounted Robust MDPs with $L_p$ Uncertainty Sets

A basic model in sequential decision making is the Markov decision process (MDP), which is extended to Robust MDPs (RMDPs) by allowing uncertainty in transition probabilities and optimizing against the worst-case transition probabilities…

Computational Complexity · Computer Science 2026-05-11 Ali Asadi, Krishnendu Chatterjee, Alipasha Montaseri, Ali Shafiee

Distributionally Robust Optimization for Sequential Decision Making

The distributionally robust Markov Decision Process (MDP) approach asks for a distributionally robust policy that achieves the maximal expected total reward under the most adversarial distribution of uncertain parameters. In this paper, we…

Systems and Control · Computer Science 2018-10-10 Zhi Chen, Pengqian Yu, William B. Haskell

Soft-Robust Algorithms for Batch Reinforcement Learning

In reinforcement learning, robust policies for high-stakes decision-making problems with limited data are usually computed by optimizing the percentile criterion, which minimizes the probability of a catastrophic failure. Unfortunately,…

Machine Learning · Computer Science 2021-03-01 Elita A. Lobo, Mohammad Ghavamzadeh, Marek Petrik

Robust Entropy-regularized Markov Decision Processes

Stochastic and soft optimal policies resulting from entropy-regularized Markov decision processes (ER-MDP) are desirable for exploration and imitation learning applications. Motivated by the fact that such policies are sensitive with…

Machine Learning · Computer Science 2022-01-03 Tien Mai, Patrick Jaillet

Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees

This paper addresses the problem of model-free reinforcement learning for Robust Markov Decision Process (RMDP) with large state spaces. The goal of the RMDP framework is to find a policy that is robust against the parameter uncertainties…

Machine Learning · Computer Science 2021-02-15 Kishan Panaganti, Dileep Kalathil

Best-Effort Policies for Robust Markov Decision Processes

We study the common generalization of Markov decision processes (MDPs) with sets of transition probabilities, known as robust MDPs (RMDPs). A standard goal in RMDPs is to compute a policy that maximizes the expected return under an…

Artificial Intelligence · Computer Science 2025-11-20 Alessandro Abate, Thom Badings, Giuseppe De Giacomo, Francesco Fabiano

Risk-Constrained Reinforcement Learning with Percentile Risk Criteria

In many sequential decision-making problems one is interested in minimizing an expected cumulative cost while taking into account risk, i.e., increased awareness of events of small probability and high consequences. Accordingly, the…

Artificial Intelligence · Computer Science 2017-04-07 Yinlam Chow, Mohammad Ghavamzadeh, Lucas Janson, Marco Pavone

Towards Minimax Optimality of Model-based Robust Reinforcement Learning

We study the sample complexity of obtaining an $\epsilon$-optimal policy in Robust discounted Markov Decision Processes (RMDPs), given only access to a generative model of the nominal kernel. This problem is widely studied in the…

Machine Learning · Computer Science 2024-06-07 Pierre Clavier, Erwan Le Pennec, Matthieu Geist

From Semi-Infinite Constraints to Structured Robust Policies: Optimal Gain Selection for Financial Systems

This paper studies the robust optimal gain selection problem for financial trading systems, formulated within a double linear policy framework, which allocates capital across long and short positions. The key objective is to…

Systems and Control · Electrical Eng. & Systems 2025-01-20 Chung-Han Hsieh

Constrained and Robust Policy Synthesis with Satisfiability-Modulo-Probabilistic-Model-Checking

The ability to compute reward-optimal policies for given and known finite Markov decision processes (MDPs) underpins a variety of applications across planning, controller synthesis, and verification. However, we often want policies (1) to…

Logic in Computer Science · Computer Science 2025-11-18 Linus Heck, Filip Macák, Milan Češka, Sebastian Junges

A Bayesian Approach to Robust Reinforcement Learning

Robust Markov Decision Processes (RMDPs) intend to ensure robustness with respect to changing or adversarial system behavior. In this framework, transitions are modeled as arbitrary elements of a known and properly structured uncertainty…

Machine Learning · Computer Science 2019-07-25 Esther Derman, Daniel Mankowitz, Timothy Mann, Shie Mannor