Related papers: Stability-Constrained Markov Decision Processes Us…

Functional Stability of Discounted Markov Decision Processes Using Economic MPC Dissipativity Theory

This paper discusses the functional stability of closed-loop Markov Chains under optimal policies resulting from a discounted optimality criterion, forming Markov Decision Processes (MDPs). We investigate the stability of MDPs in the sense…

Systems and Control · Electrical Eng. & Systems 2022-04-01 Arash Bahari Kordabad, Sebastien Gros

Economic MPC of Markov Decision Processes: Dissipativity in Undiscounted Infinite-Horizon Optimal Control

Economic Model Predictive Control (MPC) dissipativity theory is central to discussing the stability of policies resulting from minimizing economic stage costs. In its current form, the dissipativity theory for economic MPC applies to…

Systems and Control · Electrical Eng. & Systems 2022-07-25 Sébastien Gros, Mario Zanon

Reinforcement Learning of Risk-Constrained Policies in Markov Decision Processes

Markov decision processes (MDPs) are the de facto framework for sequential decision making in the presence of stochastic uncertainty. A classical optimization criterion for MDPs is to maximize the expected discounted-sum payoff, which…

Artificial Intelligence · Computer Science 2020-02-28 Tomas Brazdil, Krishnendu Chatterjee, Petr Novotny, Jiri Vahala

Near Optimality of Quantized Policies in Stochastic Control Under Weak Continuity Conditions

This paper studies the approximation of optimal control policies by quantized (discretized) policies for a very general class of Markov decision processes (MDPs). The problem is motivated by applications in networked control systems,…

Optimization and Control · Mathematics 2015-05-14 Naci Saldi, Serdar Yüksel, Tamás Linder

Policy stability and ultimate stationarity in discounted risk-sensitive stochastic control

We study discrete-time Markov Decision Processes (MDPs) on finite state-action spaces and analyze the stability of optimal policies and value functions in the long-run discounted risk-sensitive objective setting. Our analysis addresses…

Optimization and Control · Mathematics 2026-01-13 Nicole Bäuerle, Marcin Pitera, Łukasz Stettner

Reduction of total-cost and average-cost MDPs with weakly continuous transition probabilities to discounted MDPs

This note describes sufficient conditions under which total-cost and average-cost Markov decision processes (MDPs) with general state and action spaces, and with weakly continuous transition probabilities, can be reduced to discounted MDPs.…

Optimization and Control · Mathematics 2017-11-21 Eugene A. Feinberg, Jefferson Huang

Economic Model Predictive Control as a Solution to Markov Decision Processes

Markov Decision Processes (MDPs) offer a fairly generic and powerful framework to discuss the notion of optimal policies for dynamic systems, in particular when the dynamics are stochastic. However, computing the optimal policy of an MDP…

Systems and Control · Electrical Eng. & Systems 2024-07-24 Dirk Reinhardt, Akhil S. Anand, Shambhuraj Sawant, Sebastien Gros

Policy Testing in Markov Decision Processes

We study the policy testing problem in discounted Markov decision processes (MDPs) in the fixed-confidence setting under a generative model with static sampling. The goal is to decide whether the value of a given policy exceeds a specified…

Machine Learning · Statistics 2026-04-21 Kaito Ariu, Po-An Wang, Alexandre Proutiere, Kenshi Abe

Equivalence of Optimality Criteria for Markov Decision Process and Model Predictive Control

This paper shows that the optimal policy and value functions of a Markov Decision Process (MDP), either discounted or not, can be captured by a finite-horizon undiscounted Optimal Control Problem (OCP), even if based on an inexact model.…

Systems and Control · Electrical Eng. & Systems 2023-02-08 Arash Bahari Kordabad, Mario Zanon, Sebastien Gros

Discounted continuous-time constrained Markov decision processes in Polish spaces

This paper is devoted to studying constrained continuous-time Markov decision processes (MDPs) in the class of randomized policies depending on state histories. The transition rates may be unbounded, the reward and costs are admitted to be…

Probability · Mathematics 2012-01-04 Xianping Guo, Xinyuan Song

Constrained and Robust Policy Synthesis with Satisfiability-Modulo-Probabilistic-Model-Checking

The ability to compute reward-optimal policies for given and known finite Markov decision processes (MDPs) underpins a variety of applications across planning, controller synthesis, and verification. However, we often want policies (1) to…

Logic in Computer Science · Computer Science 2025-11-18 Linus Heck, Filip Macák, Milan Češka, Sebastian Junges

Scaling Up Robust MDPs by Reinforcement Learning

We consider large-scale Markov decision processes (MDPs) with parameter uncertainty, under the robust MDP paradigm. Previous studies showed that robust MDPs, based on a minimax approach to handle uncertainty, can be solved using dynamic…

Machine Learning · Computer Science 2013-06-27 Aviv Tamar, Huan Xu, Shie Mannor

Reconnaissance and Planning algorithm for constrained MDP

Practical reinforcement learning problems are often formulated as constrained Markov decision process (CMDP) problems, in which the agent has to maximize the expected return while satisfying a set of prescribed safety constraints. In this…

Machine Learning · Computer Science 2019-09-23 Shin-ichi Maeda, Hayato Watahiki, Shintarou Okada, Masanori Koyama

Constrained Stochastic Optimal Control with a Baseline Performance Guarantee

In this paper, we show how a simulated Markov decision process (MDP) built by the so-called baseline policies, can be used to compute a different policy, namely the simulated optimal policy, for which the performance of this…

Optimization and Control · Mathematics 2014-10-13 Yinlam Chow, Mohammad Ghavamzadeh

On Minimizing Total Discounted Cost in MDPs Subject to Reachability Constraints

We study the synthesis of a policy in a Markov decision process (MDP) following which an agent reaches a target state in the MDP while minimizing its total discounted cost. The problem combines a reachability criterion with a discounted…

Optimization and Control · Mathematics 2021-03-18 Yagiz Savas, Christos K. Verginis, Michael Hibbard, Ufuk Topcu

Compositional Planning for Logically Constrained Multi-Agent Markov Decision Processes

Designing control policies for large, distributed systems is challenging, especially in the context of critical, temporal logic based specifications (e.g., safety) that must be met with high probability. Compositional methods for such…

Systems and Control · Electrical Eng. & Systems 2024-10-08 Krishna C. Kalagarla, Matthew Low, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

Anytime-Constrained Reinforcement Learning

We introduce and study constrained Markov Decision Processes (cMDPs) with anytime constraints. An anytime constraint requires the agent to never violate its budget at any point in time, almost surely. Although Markovian policies are no…

Machine Learning · Computer Science 2024-06-14 Jeremy McMahan, Xiaojin Zhu

Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning

Markov decision processes (MDPs) are used to model a wide variety of applications ranging from game playing over robotics to finance. Their optimal policy typically maximizes the expected sum of rewards given at each step of the decision…

Machine Learning · Computer Science 2025-05-26 Maximilian Nägele, Jan Olle, Thomas Fösel, Remmy Zen, Florian Marquardt

A Regularized Approach to Sparse Optimal Policy in Reinforcement Learning

We propose and study a general framework for regularized Markov decision processes (MDPs) where the goal is to find an optimal policy that maximizes the expected discounted total reward plus a policy regularization term. The extant…

Machine Learning · Statistics 2019-10-22 Xiang Li, Wenhao Yang, Zhihua Zhang

Reinforcement Learning of Markov Decision Processes with Peak Constraints

In this paper, we consider reinforcement learning of Markov Decision Processes (MDP) with peak constraints, where an agent chooses a policy to optimize an objective and at the same time satisfy additional constraints. The agent has to take…

Optimization and Control · Mathematics 2019-12-09 Ather Gattami