Related papers: Predictable Interval MDPs through Entropy Regulari…
Interval Markov decision processes are a class of Markov models where the transition probabilities between the states belong to intervals. In this paper, we study the problem of efficient estimation of the optimal policies in Interval…
Stochastic and soft optimal policies resulting from entropy-regularized Markov decision processes (ER-MDP) are desirable for exploration and imitation learning applications. Motivated by the fact that such policies are sensitive with…
Interval Markov Decision Processes (IMDPs) are finite-state uncertain Markov models, where the transition probabilities belong to intervals. Recently, there has been a surge of research on employing IMDPs as abstractions of stochastic…
We present an optimization-based framework for robust permissive synthesis for Interval Markov Decision Processes (IMDPs), motivated by robotic decision-making under transition uncertainty. In many robotic systems, model inaccuracies and…
This paper considers an infinite-horizon Markov decision process (MDP) that allows for general non-exponential discount functions, in both discrete and continuous time. Due to the inherent time inconsistency, we look for a randomized…
We propose a new method for optimistic planning in infinite-horizon discounted Markov decision processes based on the idea of adding regularization to the updates of an otherwise standard approximate value iteration procedure. This…
We propose a general framework for entropy-regularized average-reward reinforcement learning in Markov decision processes (MDPs). Our approach is based on extending the linear-programming formulation of policy optimization in MDPs to…
We study the problem of synthesizing a policy that maximizes the entropy of a Markov decision process (MDP) subject to a temporal logic constraint. Such a policy minimizes the predictability of the paths it generates, or dually, maximizes…
In this paper, we consider a class of continuous-time, continuous-space stochastic optimal control problems. Building upon recent advances in Markov chain approximation methods and sampling-based algorithms for deterministic path planning,…
We investigate the problem of synthesizing optimal control policies for Markov decision processes (MDPs) with both qualitative and quantitative objectives. Specifically, our goal is to achieve a given linear temporal logic (LTL) task with…
Robust Markov decision processes (MDPs) provide a general framework to model decision problems where the system dynamics are changing or only partially known. Efficient methods for some sa-rectangular robust MDPs exist, using its…
Robust Markov decision processes (MDPs) aim to handle changing or partially known system dynamics. To solve them, one typically resorts to robust optimization methods. However, this significantly increases computational complexity and…
Markov Decision Processes (MDPs) offer a fairly generic and powerful framework to discuss the notion of optimal policies for dynamic systems, in particular when the dynamics are stochastic. However, computing the optimal policy of an MDP…
We propose and study a general framework for regularized Markov decision processes (MDPs) where the goal is to find an optimal policy that maximizes the expected discounted total reward plus a policy regularization term. The extant…
Safety in stochastic control systems, which are subject to random noise with a known probability distribution, aims to compute policies that satisfy predefined operational constraints with high confidence throughout the uncertain evolution…
Markov Decision Processes (MDPs) are a popular class of models suitable for solving control decision problems in probabilistic reactive systems. We consider parametric MDPs (pMDPs) that include parameters in some of the transition…
Markov decision processes (MDPs) are a popular model for performance analysis and optimization of stochastic systems. The parameters of stochastic behavior of MDPs are estimates from empirical observations of a system; their values are not…
Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (or minimize…
The deployment of autonomous systems in safety-critical environments requires control policies that guarantee satisfaction of complex control specifications. These systems are commonly modeled as nonlinear discrete-time stochastic systems.…