Related papers: Algorithm for Constrained Markov Decision Process …

200 papers

The problem of constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its utilities/costs. A new primal-dual approach is…

Optimization and Control · Mathematics 2021-10-22 Tianjiao Li, Ziwei Guan, Shaofeng Zou, Tengyu Xu, Yingbin Liang, Guanghui Lan

We study entropy-regularized constrained Markov decision processes (CMDPs) under the soft-max parameterization, in which an agent aims to maximize the entropy-regularized value function while satisfying constraints on the expected total…

Machine Learning · Computer Science 2023-04-10 Donghao Ying, Yuhao Ding, Javad Lavaei

Entropy regularized Markov decision processes have been widely used in reinforcement learning. This paper is concerned with the primal-dual formulation of the entropy regularized problems. Standard first-order methods suffer from slow…

Optimization and Control · Mathematics 2023-06-13 Haoya Li, Hsiang-fu Yu, Lexing Ying, Inderjit Dhillon

In many operations management problems, we need to make decisions sequentially to minimize the cost while satisfying certain constraints. One modeling approach to study such problems is constrained Markov decision process (CMDP). When…

Optimization and Control · Mathematics 2021-01-27 Yi Chen, Jing Dong, Zhaoran Wang

Previous work has separately addressed different forms of action, state and action-state entropy regularization, pure exploration and space occupation. These problems have become extremely relevant for regularization, generalization,…

Machine Learning · Computer Science 2023-02-03 Dmytro Grytskyy, Jorge Ramírez-Ruiz, Rubén Moreno-Bote

We propose a novel randomized linear programming algorithm for approximating the optimal policy of the discounted Markov decision problem. By leveraging the value-policy duality and binary-tree data structures, the algorithm adaptively…

Optimization and Control · Mathematics 2019-06-04 Mengdi Wang

A constrained Markov decision process (CMDP) approach is developed for response-adaptive procedures in clinical trials with binary outcomes. The resulting CMDP class of Bayesian response-adaptive procedures can be used to target a…

Methodology · Statistics 2024-01-31 Stef Baas, Aleida Braaksma, Richard J. Boucherie

Constrained decision-making is essential for designing safe policies in real-world control systems, yet simulated environments often fail to capture real-world adversities. We consider the problem of learning a policy that will maximize the…

Machine Learning · Computer Science 2026-02-10 Sourav Ganguly, Kishan Panaganti, Arnob Ghosh, Adam Wierman

We consider the problem of computing optimal policies in average-reward Markov decision processes. This classical problem can be formulated as a linear program directly amenable to saddle-point optimization methods, albeit with a number of…

Optimization and Control · Mathematics 2020-01-13 Joan Bas-Serrano, Gergely Neu

We consider a constrained Markov Decision Problem (CMDP) where the goal of an agent is to maximize the expected discounted sum of rewards over an infinite horizon while ensuring that the expected discounted sum of costs exceeds a certain…

Machine Learning · Computer Science 2024-11-01 Washim Uddin Mondal, Vaneet Aggarwal

This note summarizes the optimization formulations used in the study of Markov decision processes. We consider both the discounted and undiscounted processes under the standard and the entropy-regularized settings. For each setting, we…

Optimization and Control · Mathematics 2020-12-18 Lexing Ying, Yuhua Zhu

We introduce and study constrained Markov Decision Processes (cMDPs) with anytime constraints. An anytime constraint requires the agent to never violate its budget at any point in time, almost surely. Although Markovian policies are no…

Machine Learning · Computer Science 2024-06-14 Jeremy McMahan, Xiaojin Zhu

The paper studies a distributed constrained optimization problem, where multiple agents connected in a network collectively minimize the sum of individual objective functions subject to a global constraint being an intersection of the local…

Optimization and Control · Mathematics 2016-03-08 Jinlong Lei, Han-Fu Chen, Hai-Tao Fang

We consider the problem of designing policies for Markov decision processes (MDPs) with dynamic coherent risk objectives and constraints. We begin by formulating the problem in a Lagrangian framework. Under the assumption that the risk…

Artificial Intelligence · Computer Science 2021-03-30 Mohamadreza Ahmadi, Ugo Rosolia, Michel D. Ingham, Richard M. Murray, Aaron D. Ames

We consider the constrained optimal control problem for the gradual-impulsive CTMDP model with the performance criteria being the expected total undiscounted costs (from the running cost and the cost from each time an impulse being…

Optimization and Control · Mathematics 2022-04-07 Alexey Piunovskiy, Yi Zhang

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (minimize…

Optimization and Control · Mathematics 2015-07-08 Mahmoud El Chamie, Behcet Acikmese

We consider a discounted cost constrained Markov decision process (CMDP) policy optimization problem, in which an agent seeks to maximize a discounted cumulative reward subject to a number of constraints on discounted cumulative utilities.…

Optimization and Control · Mathematics 2024-11-21 Sihan Zeng, Thinh T. Doan, Justin Romberg

This work proposes an accelerated primal-dual dynamical system for affine constrained convex optimization and presents a class of primal-dual methods with nonergodic convergence rates. In continuous level, exponential decay of a novel…

Optimization and Control · Mathematics 2022-04-12 Hao Luo

We study the general approach to accelerating the convergence of the most widely used solution method of Markov decision processes with the total expected discounted reward. Inspired by the monotone behavior of the contraction mappings in…

Optimization and Control · Mathematics 2008-03-28 Oleksandr Shlakhter, Chi-Guhn Lee, Dmitry Khmelev, Nasser Jaber

Reinforcement learning is widely used in applications where one needs to perform sequential decisions while interacting with the environment. The problem becomes more challenging when the decision requirement includes satisfying some safety…

Machine Learning · Computer Science 2022-07-15 Qinbo Bai, Amrit Singh Bedi, Mridul Agarwal, Alec Koppel, Vaneet Aggarwal