Related papers: Solution Methods for Constrained Markov Decision P…

An Approximate Solution Method for Large Risk-Averse Markov Decision Processes

Stochastic domains often involve risk-averse decision makers. While recent work has focused on how to model risk in Markov decision processes using risk measures, it has not addressed the problem of solving large risk-averse formulations.…

Portfolio Management · Quantitative Finance 2012-10-19 Marek Petrik, Dharmashankar Subramanian

Constrained discounted Markov decision processes with Borel state spaces

We study discrete-time discounted constrained Markov decision processes (CMDPs) on Borel spaces with unbounded reward functions. In our approach the transition probability functions are weakly or set-wise continuous. The reward functions…

Optimization and Control · Mathematics 2019-03-29 Eugene A. Feinberg, Anna Jaśkiewicz, Andrzej S. Nowak

Stochastic dominance-constrained Markov decision processes

We are interested in risk constraints for infinite horizon discrete time Markov decision processes (MDPs). Starting with average reward MDPs, we show that increasing concave stochastic dominance constraints on the empirical distribution of…

Optimization and Control · Mathematics 2012-06-21 William B. Haskell, Rahul Jain

A convex programming approach for discrete-time Markov decision processes under the expected total reward criterion

In this work, we study discrete-time Markov decision processes (MDPs) under constraints with Borel state and action spaces and where all the performance functions have the same form of the expected total reward (ETR) criterion over the…

Probability · Mathematics 2019-05-10 F. Dufour, Alexandre Genadot

Computing monotone policies for Markov decision processes: a nearly-isotonic penalty approach

This paper discusses algorithms for solving Markov decision processes (MDPs) that have monotone optimal policies. We propose a two-stage alternating convex optimization scheme that can accelerate the search for an optimal policy by…

Systems and Control · Computer Science 2017-04-04 Robert Mattila, Cristian R. Rojas, Vikram Krishnamurthy, Bo Wahlberg

Scenario-Based Verification of Uncertain MDPs

We consider Markov decision processes (MDPs) in which the transition probabilities and rewards belong to an uncertainty set parametrized by a collection of random variables. The probability distributions for these random parameters are…

Logic in Computer Science · Computer Science 2020-02-26 Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu

Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning

Markov decision processes (MDPs) are used to model a wide variety of applications ranging from game playing over robotics to finance. Their optimal policy typically maximizes the expected sum of rewards given at each step of the decision…

Machine Learning · Computer Science 2025-05-26 Maximilian Nägele, Jan Olle, Thomas Fösel, Remmy Zen, Florian Marquardt

Model-Free Algorithm and Regret Analysis for MDPs with Long-Term Constraints

In the optimization of dynamical systems, the variables typically have constraints. Such problems can be modeled as a constrained Markov Decision Process (CMDP). This paper considers a model-free approach to the problem, where the…

Machine Learning · Computer Science 2021-02-02 Qinbo Bai, Vaneet Aggarwal, Ather Gattami

Solving Long-run Average Reward Robust MDPs via Stochastic Games

Markov decision processes (MDPs) provide a standard framework for sequential decision making under uncertainty. However, MDPs do not take uncertainty in transition probabilities into account. Robust Markov decision processes (RMDPs) address…

Artificial Intelligence · Computer Science 2024-05-01 Krishnendu Chatterjee, Ehsan Kafshdar Goharshady, Mehrdad Karrabi, Petr Novotný, Đorđe Žikelić

J-P: MDP. FP. PP.: Characterizing Total Expected Rewards in Markov Decision Processes as Least Fixed Points with an Application to Operational Semantics of Probabilistic Programs (Technical Report)

Markov decision processes (MDPs) with rewards are a widespread and well-studied model for systems that make both probabilistic and nondeterministic choices. A fundamental result about MDPs is that their minimal and maximal expected rewards…

Logic in Computer Science · Computer Science 2024-11-26 Kevin Batz, Benjamin Lucien Kaminski, Christoph Matheja, Tobias Winkler

Fast Online Exact Solutions for Deterministic MDPs with Sparse Rewards

Markov Decision Processes (MDPs) are a mathematical framework for modeling sequential decision making under uncertainty. The classical approaches for solving MDPs are well known and have been widely studied, some of which rely on…

Machine Learning · Computer Science 2018-05-18 Joshua R. Bertram, Xuxi Yang, Peng Wei

Multi-Objective Approaches to Markov Decision Processes with Uncertain Transition Parameters

Markov decision processes (MDPs) are a popular model for performance analysis and optimization of stochastic systems. The parameters of stochastic behavior of MDPs are estimates from empirical observations of a system; their values are not…

Artificial Intelligence · Computer Science 2017-10-26 Dimitri Scheftelowitsch, Peter Buchholz, Vahid Hashemi, Holger Hermanns

Discounted continuous-time constrained Markov decision processes in Polish spaces

This paper is devoted to studying constrained continuous-time Markov decision processes (MDPs) in the class of randomized policies depending on state histories. The transition rates may be unbounded, the reward and costs are admitted to be…

Probability · Mathematics 2012-01-04 Xianping Guo, Xinyuan Song

Reward is enough for convex MDPs

Maximising a cumulative reward function that is Markov and stationary, i.e., defined over state-action pairs and independent of time, is sufficient to capture many kinds of goals in a Markov decision process (MDP). However, not all goals…

Artificial Intelligence · Computer Science 2023-06-05 Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh

Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees

Constrained decision-making is essential for designing safe policies in real-world control systems, yet simulated environments often fail to capture real-world adversities. We consider the problem of learning a policy that will maximize the…

Machine Learning · Computer Science 2026-02-10 Sourav Ganguly, Kishan Panaganti, Arnob Ghosh, Adam Wierman

Algorithm for Constrained Markov Decision Process with Linear Convergence

The problem of constrained Markov decision process is considered. An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs (the number of constraints is relatively small). A new dual…

Optimization and Control · Mathematics 2022-10-21 Egor Gladin, Maksim Lavrik-Karmazin, Karina Zainullina, Varvara Rudenko, Alexander Gasnikov, Martin Takáč

Learning Constrained Markov Decision Processes With Non-stationary Rewards and Constraints

In constrained Markov decision processes (CMDPs) with adversarial rewards and constraints, a well-known impossibility result prevents any algorithm from attaining both sublinear regret and sublinear constraint violation, when competing…

Machine Learning · Computer Science 2024-09-27 Francesco Emanuele Stradi, Anna Lunghi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti

Finite-Horizon Markov Decision Processes with Sequentially-Observed Transitions

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (or minimize…

Optimization and Control · Mathematics 2015-07-07 Mahmoud El Chamie, Behcet Acikmese

Constrained Risk-Averse Markov Decision Processes

We consider the problem of designing policies for Markov decision processes (MDPs) with dynamic coherent risk objectives and constraints. We begin by formulating the problem in a Lagrangian framework. Under the assumption that the risk…

Artificial Intelligence · Computer Science 2021-03-30 Mohamadreza Ahmadi, Ugo Rosolia, Michel D. Ingham, Richard M. Murray, Aaron D. Ames

Quickest Change Detection Approach to Optimal Control in Markov Decision Processes with Model Changes

Optimal control in non-stationary Markov decision processes (MDP) is a challenging problem. The aim in such a control problem is to maximize the long-term discounted reward when the transition dynamics or the reward function can change over…

Applications · Statistics 2017-03-03 Taposh Banerjee, Miao Liu, Jonathan P. How