Related papers: Robust Markov Decision Process: Beyond Rectangular…
Markov decision processes (MDPs) are formal models commonly used in sequential decision-making. MDPs capture the stochasticity that may arise, for instance, from imprecise actuators via probabilities in the transition function. However, in…
Markov decision processes (MDP) are a well-established model for sequential decision-making in the presence of probabilities. In robust MDP (RMDP), every action is associated with an uncertainty set of probability distributions, modelling…
The distributionally robust Markov Decision Process (MDP) approach asks for a distributionally robust policy that achieves the maximal expected total reward under the most adversarial distribution of uncertain parameters. In this paper, we…
We study infinite-horizon robust Markov decision processes (MDPs) on continuous state spaces with structured rectangular ambiguity set. The proposed ambiguity set falls within the convex hull of unknown generating kernels. We utilize the…
This paper studies Markov Decision Processes under parameter uncertainty. We adapt the distributionally robust optimization framework, and assume that the uncertain parameters are random variables following an unknown distribution, and…
Markov decision processes (MDPs) are a standard model for sequential decision-making problems and are widely used across many scientific areas, including formal methods and artificial intelligence (AI). MDPs do, however, come with the…
Robust Markov Decision Processes (MDPs) are receiving much attention in learning a robust policy which is less sensitive to environment changes. There are an increasing number of works analyzing sample-efficiency of robust MDPs. However,…
Fueled by advances in both robust optimization theory and reinforcement learning (RL), robust Markov Decision Processes (RMDPs) have garnered increasing attention due to their powerful capability for sequential decision-making under…
Synthesising verifiably correct controllers for dynamical systems is crucial for safety-critical problems. To achieve this, it is important to account for uncertainty in a robust manner, while at the same time it is often of interest to…
This paper studies the computation of robust deterministic policies for Markov Decision Processes (MDPs) in the Lightning Does Not Strike Twice (LDST) model of Mannor, Mebel and Xu (ICML '12). In this model, designed to provide robustness…
Stochastic and soft optimal policies resulting from entropy-regularized Markov decision processes (ER-MDP) are desirable for exploration and imitation learning applications. Motivated by the fact that such policies are sensitive with…
We study the common generalization of Markov decision processes (MDPs) with sets of transition probabilities, known as robust MDPs (RMDPs). A standard goal in RMDPs is to compute a policy that maximizes the expected return under an…
Robust Markov decision processes (r-MDPs) extend MDPs by explicitly modelling epistemic uncertainty about transition dynamics. Learning r-MDPs from interactions with an unknown environment enables the synthesis of robust policies with…
We propose policy gradient algorithms for robust infinite-horizon Markov decision processes (MDPs) with non-rectangular uncertainty sets, thereby addressing an open challenge in the robust MDP literature. Indeed, uncertainty sets that…
We consider risk-sensitive Markov decision processes (MDPs), where the MDP model is influenced by a parameter which takes values in a compact metric space. We identify sufficient conditions under which small perturbations in the model…
This paper addresses a key limitation in existing counterfactual inference methods for Markov Decision Processes (MDPs). Current approaches assume a specific causal model to make counterfactuals identifiable. However, there are usually many…
In this work, we consider an online robust Markov Decision Process (MDP) where we have the information of finitely many prototypes of the underlying transition kernel. We consider an adaptively updated ambiguity set of the prototypes and…
Robust Markov decision processes (MDPs) have attracted significant interest due to their ability to protect MDPs from poor out-of-sample performance in the presence of ambiguity. In contrast to classical MDPs, which account for…
In robust Markov decision processes (MDPs), the uncertainty in the transition kernel is addressed by finding a policy that optimizes the worst-case performance over an uncertainty set of MDPs. While much of the literature has focused on…
Markov decision processes (MDPs) are a popular model for performance analysis and optimization of stochastic systems. The parameters of stochastic behavior of MDPs are estimates from empirical observations of a system; their values are not…