Related papers: A Gradient-Aware Search Algorithm for Constrained …

Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process

The problem of constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its utilities/costs. A new primal-dual approach is…

Optimization and Control · Mathematics 2021-10-22 Tianjiao Li, Ziwei Guan, Shaofeng Zou, Tengyu Xu, Yingbin Liang, Guanghui Lan

Finite-Horizon Markov Decision Processes with State Constraints

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (minimize…

Optimization and Control · Mathematics 2015-07-08 Mahmoud El Chamie, Behcet Acikmese

A safe exploration approach to constrained Markov decision processes

We consider discounted infinite-horizon constrained Markov decision processes (CMDPs), where the goal is to find an optimal policy that maximizes the expected cumulative reward while satisfying expected cumulative constraints. Motivated by…

Machine Learning · Computer Science 2025-03-04 Tingting Ni, Maryam Kamgarpour

Cancellation-Free Regret Bounds for Lagrangian Approaches in Constrained Markov Decision Processes

Constrained Markov Decision Processes (CMDPs) are one of the common ways to model safe reinforcement learning problems, where constraint functions model the safety objectives. Lagrangian-based dual or primal-dual algorithms provide…

Machine Learning · Computer Science 2023-08-31 Adrian Müller, Pragnya Alatur, Giorgia Ramponi, Niao He

A Dual Approach to Constrained Markov Decision Processes with Entropy Regularization

We study entropy-regularized constrained Markov decision processes (CMDPs) under the soft-max parameterization, in which an agent aims to maximize the entropy-regularized value function while satisfying constraints on the expected total…

Machine Learning · Computer Science 2023-04-10 Donghao Ying, Yuhao Ding, Javad Lavaei

Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs

We study the problem of computing deterministic optimal policies for constrained Markov decision processes (MDPs) with continuous state and action spaces, which are widely encountered in constrained dynamical systems. Designing…

Artificial Intelligence · Computer Science 2025-04-07 Sergio Rozada, Dongsheng Ding, Antonio G. Marques, Alejandro Ribeiro

New Penalized Stochastic Gradient Methods for Linearly Constrained Strongly Convex Optimization

For minimizing a strongly convex objective function subject to linear inequality constraints, we consider a penalty approach that allows one to utilize stochastic methods for problems with a large number of constraints and/or objective…

Optimization and Control · Mathematics 2022-02-16 Meng Li, Paul Grigas, Alper Atamturk

A Best-of-Both-Worlds Algorithm for Constrained MDPs with Long-Term Constraints

We study online learning in episodic constrained Markov decision processes (CMDPs), where the learner aims at collecting as much reward as possible over the episodes, while satisfying some long-term constraints during the learning process.…

Machine Learning · Computer Science 2024-08-30 Jacopo Germano, Francesco Emanuele Stradi, Gianmarco Genalti, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti

A two-stage search framework for constrained multi-gradient descent

The multi-gradient descent algorithm (MGDA) finds a common descent direction that can improve all objectives by identifying the minimum-norm point in the convex hull of the objective gradients. This method has become a foundational tool in…

Optimization and Control · Mathematics 2025-04-16 Yuan-Zheng Lei, Yaobang Gong, Xianfeng Terry Yang

Algorithm for Constrained Markov Decision Process with Linear Convergence

The problem of constrained Markov decision process is considered. An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs (the number of constraints is relatively small). A new dual…

Optimization and Control · Mathematics 2022-10-21 Egor Gladin, Maksim Lavrik-Karmazin, Karina Zainullina, Varvara Rudenko, Alexander Gasnikov, Martin Takáč

A Primal-Dual Approach to Constrained Markov Decision Processes

In many operations management problems, we need to make decisions sequentially to minimize the cost while satisfying certain constraints. One modeling approach to study such problems is constrained Markov decision process (CMDP). When…

Optimization and Control · Mathematics 2021-01-27 Yi Chen, Jing Dong, Zhaoran Wang

Policy-based Primal-Dual Methods for Concave CMDP with Variance Reduction

We study Concave Constrained Markov Decision Processes (Concave CMDPs) where both the objective and constraints are defined as concave functions of the state-action occupancy measure. We propose the Variance-Reduced Primal-Dual Policy…

Machine Learning · Computer Science 2024-05-28 Donghao Ying, Mengzi Amy Guo, Hyunin Lee, Yuhao Ding, Javad Lavaei, Zuo-Jun Max Shen

Sample-Efficient Constrained Reinforcement Learning with General Parameterization

We consider a constrained Markov Decision Problem (CMDP) where the goal of an agent is to maximize the expected discounted sum of rewards over an infinite horizon while ensuring that the expected discounted sum of costs exceeds a certain…

Machine Learning · Computer Science 2024-11-01 Washim Uddin Mondal, Vaneet Aggarwal

Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm

This paper explores the realm of infinite horizon average reward Constrained Markov Decision Processes (CMDPs). To the best of our knowledge, this work is the first to delve into the regret and constraint violation analysis of average…

Machine Learning · Computer Science 2024-10-31 Qinbo Bai, Washim Uddin Mondal, Vaneet Aggarwal

Linear programming-based solution methods for constrained partially observable Markov decision processes

Constrained partially observable Markov decision processes (CPOMDPs) have been used to model various real-world phenomena. However, they are notoriously difficult to solve to optimality, and there exist only a few approximation methods for…

Artificial Intelligence · Computer Science 2023-06-27 Robert K. Helmeczi, Can Kavaklioglu, Mucahit Cevik

Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Conservative Natural Policy Gradient Primal-Dual Algorithm

We consider the problem of constrained Markov decision process (CMDP) in continuous state-action spaces where the goal is to maximize the expected cumulative reward subject to some constraints. We propose a novel Conservative Natural…

Machine Learning · Computer Science 2024-05-20 Qinbo Bai, Amrit Singh Bedi, Vaneet Aggarwal

Learning Constrained Markov Decision Processes With Non-stationary Rewards and Constraints

In constrained Markov decision processes (CMDPs) with adversarial rewards and constraints, a well-known impossibility result prevents any algorithm from attaining both sublinear regret and sublinear constraint violation, when competing…

Machine Learning · Computer Science 2024-09-27 Francesco Emanuele Stradi, Anna Lunghi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti

A Sample-Efficient Algorithm for Episodic Finite-Horizon MDP with Constraints

Constrained Markov Decision Processes (CMDPs) formalize sequential decision-making problems whose objective is to minimize a cost function while satisfying constraints on various cost functions. In this paper, we consider the setting of…

Machine Learning · Computer Science 2020-09-25 Krishna C. Kalagarla, Rahul Jain, Pierluigi Nuzzo

Policy Optimization for Constrained MDPs with Provable Fast Global Convergence

We address the problem of finding the optimal policy of a constrained Markov decision process (CMDP) using a gradient descent-based algorithm. Previous results have shown that a primal-dual approach can achieve an $\mathcal{O}(1/\sqrt{T})$…

Machine Learning · Computer Science 2022-02-07 Tao Liu, Ruida Zhou, Dileep Kalathil, P. R. Kumar, Chao Tian

Risk-Constrained Reinforcement Learning with Percentile Risk Criteria

In many sequential decision-making problems one is interested in minimizing an expected cumulative cost while taking into account risk, i.e., increased awareness of events of small probability and high consequences. Accordingly, the…

Artificial Intelligence · Computer Science 2017-04-07 Yinlam Chow, Mohammad Ghavamzadeh, Lucas Janson, Marco Pavone