Related papers: A Mean Field Approach for Optimization in Particle…

Finite-Horizon Markov Decision Processes with State Constraints

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (minimize…

Optimization and Control · Mathematics 2015-07-08 Mahmoud El Chamie, Behcet Acikmese

Asymptotic Optimality of Finite Approximations to Markov Decision Processes with Borel Spaces

Calculating optimal policies is known to be computationally difficult for Markov decision processes (MDPs) with Borel state and action spaces. This paper studies finite-state approximations of discrete time Markov decision processes with…

Optimization and Control · Mathematics 2016-09-23 Naci Saldi, Serdar Yüksel, Tamás Linder

Finite-State Approximations to Discounted and Average Cost Constrained Markov Decision Processes

In this paper, we consider the finite-state approximation of a discrete-time constrained Markov decision process (MDP) under the discounted and average cost criteria. Using the linear programming formulation of the constrained discounted…

Optimization and Control · Mathematics 2018-07-10 Naci Saldi

Mean-Variance Optimization of Discrete Time Discounted Markov Decision Processes

In this paper, we study a mean-variance optimization problem in an infinite horizon discrete time discounted Markov decision process (MDP). The objective is to minimize the variance of system rewards with the constraint of mean performance.…

Optimization and Control · Mathematics 2017-08-24 Li Xia

Optimizing Quantiles in Preference-based Markov Decision Processes

In the Markov decision process model, policies are usually evaluated by expected cumulative rewards. As this decision criterion is not always suitable, we propose in this paper an algorithm for computing a policy optimal for the quantile…

Artificial Intelligence · Computer Science 2016-12-02 Hugo Gilbert, Paul Weng, Yan Xu

Continuous-time mean field Markov decision models

We consider a finite number of $N$ statistically equal agents, each moving on a finite set of states according to a continuous-time Markov Decision Process (MDP). Transition intensities of the agents and generated rewards depend not only on…

Probability · Mathematics 2025-09-23 Nicole Bäuerle, Sebastian Höfer

On the Global Convergence of Policy Gradient in Average Reward Markov Decision Processes

We present the first finite time global convergence analysis of policy gradient in the context of infinite horizon average reward Markov decision processes (MDPs). Specifically, we focus on ergodic tabular MDPs with finite state and action…

Machine Learning · Computer Science 2024-03-12 Navdeep Kumar, Yashaswini Murthy, Itai Shufaro, Kfir Y. Levy, R. Srikant, Shie Mannor

Finite-Horizon Markov Decision Processes with Sequentially-Observed Transitions

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (or minimize…

Optimization and Control · Mathematics 2015-07-07 Mahmoud El Chamie, Behcet Acikmese

Average-Cost MDPs with Infinite State and Action Sets: New Sufficient Conditions for Optimality Inequalities and Equations

This paper studies discrete-time average-cost infinite-horizon Markov decision processes (MDPs) with Borel state and action sets. It introduces new sufficient conditions for the validity of optimality inequalities and optimality…

Optimization and Control · Mathematics 2025-01-28 Eugene A. Feinberg, Pavlo O. Kasyanov, Liliia S. Paliichuk

Risk-Sensitive Markov Decision Processes with Combined Metrics of Mean and Variance

This paper investigates the optimization problem of an infinite stage discrete time Markov decision process (MDP) with a long-run average metric considering both mean and variance of rewards together. Such performance metric is important…

Optimization and Control · Mathematics 2020-08-11 Li Xia

Optimal control policies for resource allocation in the Cloud: comparison between Markov decision process and heuristic approaches

We consider an auto-scaling technique in a cloud system where virtual machines hosted on a physical node are turned on and off depending on the queue's occupation (or thresholds), in order to minimise a global cost integrating both energy…

Optimization and Control · Mathematics 2021-07-26 Thomas Tournaire, Hind Castel-Taleb, Emmanuel Hyon

Lower Bound On the Computational Complexity of Discounted Markov Decision Problems

We study the computational complexity of the infinite-horizon discounted-reward Markov Decision Problem (MDP) with a finite state space $|\mathcal{S}|$ and a finite action space $|\mathcal{A}|$. We show that any randomized algorithm needs a…

Computational Complexity · Computer Science 2017-05-24 Yichen Chen, Mengdi Wang

On the Convergence of Optimal Actions for Markov Decision Processes and the Optimality of $(s,S)$ Inventory Policies

This paper studies convergence properties of optimal values and actions for discounted and average-cost Markov Decision Processes (MDPs) with weakly continuous transition probabilities and applies these properties to the stochastic…

Optimization and Control · Mathematics 2017-03-21 Eugene A. Feinberg, Mark E. Lewis

Risk-Averse $\omega$-regular Markov Decision Process Control

Many control problems in environments that can be modeled as Markov decision processes (MDPs) concern infinite-time horizon specifications. The classical aim in this context is to compute a control policy that maximizes the probability of…

Systems and Control · Computer Science 2017-05-03 Ruediger Ehlers, Salar Moarref, Ufuk Topcu

Linear programming-based solution methods for constrained partially observable Markov decision processes

Constrained partially observable Markov decision processes (CPOMDPs) have been used to model various real-world phenomena. However, they are notoriously difficult to solve to optimality, and there exist only a few approximation methods for…

Artificial Intelligence · Computer Science 2023-06-27 Robert K. Helmeczi, Can Kavaklioglu, Mucahit Cevik

Mean-field Markov decision processes with common noise and open-loop controls

We develop an exhaustive study of Markov decision process (MDP) under mean field interaction both on states and actions in the presence of common noise, and when optimization is performed over open-loop controls on infinite horizon. Such…

Optimization and Control · Mathematics 2021-09-10 Médéric Motte, Huyên Pham

Large-Scale Markov Decision Problems via the Linear Programming Dual

We consider the problem of controlling a fully specified Markov decision process (MDP), also known as the planning problem, when the state space is very large and calculating the optimal policy is intractable. Instead, we pursue the more…

Optimization and Control · Mathematics 2019-01-09 Yasin Abbasi-Yadkori, Peter L. Bartlett, Xi Chen, Alan Malek

Mean Field Markov Decision Processes

We consider mean-field control problems in discrete time with discounted reward, infinite time horizon and compact state and action space. The existence of optimal policies is shown and the limiting mean-field problem is derived when the…

Optimization and Control · Mathematics 2025-10-16 Nicole Bäuerle

Approximate Constrained Discounted Dynamic Programming with Uniform Feasibility and Optimality

We consider a dynamic programming (DP) approach to approximately solving an infinite-horizon constrained Markov decision process (CMDP) problem with a fixed initial-state for the expected total discounted-reward criterion with a…

Optimization and Control · Mathematics 2023-08-08 Hyeong Soo Chang

Entropy Maximization for Markov Decision Processes Under Temporal Logic Constraints

We study the problem of synthesizing a policy that maximizes the entropy of a Markov decision process (MDP) subject to a temporal logic constraint. Such a policy minimizes the predictability of the paths it generates, or dually, maximizes…

Optimization and Control · Mathematics 2019-06-17 Yagiz Savas, Melkior Ornik, Murat Cubuktepe, Mustafa O. Karabag, Ufuk Topcu