Related papers: Compactly Restrictable Metric Policy Optimization …

Economic Model Predictive Control as a Solution to Markov Decision Processes

Markov Decision Processes (MDPs) offer a fairly generic and powerful framework to discuss the notion of optimal policies for dynamic systems, in particular when the dynamics are stochastic. However, computing the optimal policy of an MDP…

Systems and Control · Electrical Eng. & Systems 2024-07-24 Dirk Reinhardt, Akhil S. Anand, Shambhuraj Sawant, Sebastien Gros

Equivalence of Optimality Criteria for Markov Decision Process and Model Predictive Control

This paper shows that the optimal policy and value functions of a Markov Decision Process (MDP), either discounted or not, can be captured by a finite-horizon undiscounted Optimal Control Problem (OCP), even if based on an inexact model.…

Systems and Control · Electrical Eng. & Systems 2023-02-08 Arash Bahari Kordabad, Mario Zanon, Sebastien Gros

Constrained Risk-Averse Markov Decision Processes

We consider the problem of designing policies for Markov decision processes (MDPs) with dynamic coherent risk objectives and constraints. We begin by formulating the problem in a Lagrangian framework. Under the assumption that the risk…

Artificial Intelligence · Computer Science 2021-03-30 Mohamadreza Ahmadi, Ugo Rosolia, Michel D. Ingham, Richard M. Murray, Aaron D. Ames

Robust Constrained-MDPs: Soft-Constrained Robust Policy Optimization under Model Uncertainty

In this paper, we focus on the problem of robustifying reinforcement learning (RL) algorithms with respect to model uncertainties. Indeed, in the framework of model-based RL, we propose to merge the theory of constrained Markov decision…

Machine Learning · Computer Science 2020-10-13 Reazul Hasan Russel, Mouhacine Benosman, Jeroen Van Baar

Finite-Horizon Markov Decision Processes with State Constraints

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (minimize…

Optimization and Control · Mathematics 2015-07-08 Mahmoud El Chamie, Behcet Acikmese

Multi-Objective Policy Gradients with Topological Constraints

Multi-objective optimization models that encode ordered sequential constraints provide a solution to model various challenging problems including encoding preferences, modeling a curriculum, and enforcing measures of safety. A recently…

Artificial Intelligence · Computer Science 2022-09-16 Kyle Hollins Wray, Stas Tiomkin, Mykel J. Kochenderfer, Pieter Abbeel

Risk-Averse Decision Making Under Uncertainty

A large class of decision making under uncertainty problems can be described via Markov decision processes (MDPs) or partially observable MDPs (POMDPs), with application to artificial intelligence and operations research, among others.…

Artificial Intelligence · Computer Science 2021-09-10 Mohamadreza Ahmadi, Ugo Rosolia, Michel D. Ingham, Richard M. Murray, Aaron D. Ames

Finite-Horizon Markov Decision Processes with Sequentially-Observed Transitions

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (or minimize…

Optimization and Control · Mathematics 2015-07-07 Mahmoud El Chamie, Behcet Acikmese

Policy Testing in Markov Decision Processes

We study the policy testing problem in discounted Markov decision processes (MDPs) in the fixed-confidence setting under a generative model with static sampling. The goal is to decide whether the value of a given policy exceeds a specified…

Machine Learning · Statistics 2026-04-21 Kaito Ariu, Po-An Wang, Alexandre Proutiere, Kenshi Abe

Large-Scale Markov Decision Problems via the Linear Programming Dual

We consider the problem of controlling a fully specified Markov decision process (MDP), also known as the planning problem, when the state space is very large and calculating the optimal policy is intractable. Instead, we pursue the more…

Optimization and Control · Mathematics 2019-01-09 Yasin Abbasi-Yadkori, Peter L. Bartlett, Xi Chen, Alan Malek

Robustness to Modeling Errors in Risk-Sensitive Markov Decision Problems with Markov Risk Measures

We consider risk-sensitive Markov decision processes (MDPs), where the MDP model is influenced by a parameter which takes values in a compact metric space. We identify sufficient conditions under which small perturbations in the model…

Optimization and Control · Mathematics 2022-09-28 Shiping Shao, Abhishek Gupta, William B. Haskell

Constrained Markov decision processes for response-adaptive procedures in clinical trials with binary outcomes

A constrained Markov decision process (CMDP) approach is developed for response-adaptive procedures in clinical trials with binary outcomes. The resulting CMDP class of Bayesian response-adaptive procedures can be used to target a…

Methodology · Statistics 2024-01-31 Stef Baas, Aleida Braaksma, Richard J. Boucherie

Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees

Constrained decision-making is essential for designing safe policies in real-world control systems, yet simulated environments often fail to capture real-world adversities. We consider the problem of learning a policy that will maximize the…

Machine Learning · Computer Science 2026-02-10 Sourav Ganguly, Kishan Panaganti, Arnob Ghosh, Adam Wierman

Myopic Policy Bounds for Information Acquisition POMDPs

This paper addresses the problem of optimal control of robotic sensing systems aimed at autonomous information gathering in scenarios such as environmental monitoring, search and rescue, and surveillance and reconnaissance. The information…

Systems and Control · Computer Science 2016-01-28 Mikko Lauri, Nikolay Atanasov, George J. Pappas, Risto Ritala

Constrained and Robust Policy Synthesis with Satisfiability-Modulo-Probabilistic-Model-Checking

The ability to compute reward-optimal policies for given and known finite Markov decision processes (MDPs) underpins a variety of applications across planning, controller synthesis, and verification. However, we often want policies (1) to…

Logic in Computer Science · Computer Science 2025-11-18 Linus Heck, Filip Macák, Milan Češka, Sebastian Junges

Linear Programming for Decision Processes with Partial Information

Markov Decision Processes (MDPs) are stochastic optimization problems that model situations where a decision maker controls a system based on its state. Partially observed Markov decision processes (POMDPs) are generalizations of MDPs where…

Optimization and Control · Mathematics 2019-03-26 Victor Cohen, Axel Parmentier

Multi-Objective Approaches to Markov Decision Processes with Uncertain Transition Parameters

Markov decision processes (MDPs) are a popular model for performance analysis and optimization of stochastic systems. The parameters of stochastic behavior of MDPs are estimates from empirical observations of a system; their values are not…

Artificial Intelligence · Computer Science 2017-10-26 Dimitri Scheftelowitsch, Peter Buchholz, Vahid Hashemi, Holger Hermanns

Robust Combination of Local Controllers

Planning problems are hard; motion planning, for example, is PSPACE-hard. Such problems are even more difficult in the presence of uncertainty. Although Markov Decision Processes (MDPs) provide a formal framework for such problems, finding…

Artificial Intelligence · Computer Science 2013-01-14 Carlos E. Guestrin, Dirk Ormoneit

MDP Optimal Control under Temporal Logic Constraints

In this paper, we develop a method to automatically generate a control policy for a dynamical system modeled as a Markov Decision Process (MDP). The control specification is given as a Linear Temporal Logic (LTL) formula over a set of…

Robotics · Computer Science 2011-03-24 Xu Chu Ding, Stephen L. Smith, Calin Belta, Daniela Rus

Risk-Averse $\omega$-regular Markov Decision Process Control

Many control problems in environments that can be modeled as Markov decision processes (MDPs) concern infinite-time horizon specifications. The classical aim in this context is to compute a control policy that maximizes the probability of…

Systems and Control · Computer Science 2017-05-03 Ruediger Ehlers, Salar Moarref, Ufuk Topcu