Related papers: Anytime-Constrained Reinforcement Learning

200 papers

Constrained decision-making is essential for designing safe policies in real-world control systems, yet simulated environments often fail to capture real-world adversities. We consider the problem of learning a policy that will maximize the…

Machine Learning · Computer Science 2026-02-10 Sourav Ganguly, Kishan Panaganti, Arnob Ghosh, Adam Wierman

This paper is devoted to studying constrained continuous-time Markov decision processes (MDPs) in the class of randomized policies depending on state histories. The transition rates may be unbounded, the reward and costs are admitted to be…

Probability · Mathematics 2012-01-04 Xianping Guo, Xinyuan Song

We study the computational complexity of approximating general constrained Markov decision processes. Our primary contribution is the design of a polynomial time $(0,\epsilon)$-additive bicriteria approximation algorithm for finding optimal…

Data Structures and Algorithms · Computer Science 2025-02-12 Jeremy McMahan

In this paper, we consider reinforcement learning of Markov Decision Processes (MDP) with peak constraints, where an agent chooses a policy to optimize an objective and at the same time satisfy additional constraints. The agent has to take…

Optimization and Control · Mathematics 2019-12-09 Ather Gattami

We consider synthesis of control policies that maximize the probability of satisfying given temporal logic specifications in unknown, stochastic environments. We model the interaction between the system and its environment as a Markov…

Systems and Control · Computer Science 2014-05-01 Jie Fu, Ufuk Topcu

In this paper, we consider the finite-state approximation of a discrete-time constrained Markov decision process (MDP) under the discounted and average cost criteria. Using the linear programming formulation of the constrained discounted…

Optimization and Control · Mathematics 2018-07-10 Naci Saldi

Markov decision processes (MDPs) are the de facto framework for sequential decision making in the presence of stochastic uncertainty. A classical optimization criterion for MDPs is to maximize the expected discounted-sum payoff, which…

Artificial Intelligence · Computer Science 2020-02-28 Tomas Brazdil, Krishnendu Chatterjee, Petr Novotny, Jiri Vahala

This paper studies the problem of Anytime-Competitive Markov Decision Process (A-CMDP). Existing works on Constrained Markov Decision Processes (CMDPs) aim to optimize the expected reward while constraining the expected cost over random…

Machine Learning · Computer Science 2024-02-06 Jianyi Yang, Pengfei Li, Tongxin Li, Adam Wierman, Shaolei Ren

We study the problem of learning policies that maximize cumulative reward while satisfying safety constraints, even when the real environment differs from a simulator or nominal model. We focus on robust constrained Markov decision…

Machine Learning · Computer Science 2025-11-12 Sourav Ganguly, Arnob Ghosh

Constrained Markov decision processes (CMDPs) are used as a decision-making framework to study the long-run performance of a stochastic system. It is well-known that a stationary optimal policy of a CMDP problem under discounted cost…

Optimization and Control · Mathematics 2025-06-02 V Varagapriya, Vikas Vikram Singh, Abdel Lisser

The fixed-horizon constrained Markov Decision Process (C-MDP) is a well-known model for planning in stochastic environments under operating constraints. Chance-Constrained MDP (CC-MDP) is a variant that allows bounding the probability of…

Artificial Intelligence · Computer Science 2023-04-19 Majid Khonji

We study the synthesis of a policy in a Markov decision process (MDP) following which an agent reaches a target state in the MDP while minimizing its total discounted cost. The problem combines a reachability criterion with a discounted…

Optimization and Control · Mathematics 2021-03-18 Yagiz Savas, Christos K. Verginis, Michael Hibbard, Ufuk Topcu

The ability to compute reward-optimal policies for given and known finite Markov decision processes (MDPs) underpins a variety of applications across planning, controller synthesis, and verification. However, we often want policies (1) to…

Logic in Computer Science · Computer Science 2025-11-18 Linus Heck, Filip Macák, Milan Češka, Sebastian Junges

Practical reinforcement learning problems are often formulated as constrained Markov decision process (CMDP) problems, in which the agent has to maximize the expected return while satisfying a set of prescribed safety constraints. In this…

Machine Learning · Computer Science 2019-09-23 Shin-ichi Maeda, Hayato Watahiki, Shintarou Okada, Masanori Koyama

This paper studies the computation of robust deterministic policies for Markov Decision Processes (MDPs) in the Lightning Does Not Strike Twice (LDST) model of Mannor, Mebel and Xu (ICML '12). In this model, designed to provide robustness…

Optimization and Control · Mathematics 2024-12-18 Fei Wu, Erik Demeulemeester, Jannik Matuschke

We consider a dynamic programming (DP) approach to approximately solving an infinite-horizon constrained Markov decision process (CMDP) problem with a fixed initial-state for the expected total discounted-reward criterion with a…

Optimization and Control · Mathematics 2023-08-08 Hyeong Soo Chang

We study the computational complexity of the infinite-horizon discounted-reward Markov Decision Problem (MDP) with a finite state space $|\mathcal{S}|$ and a finite action space $|\mathcal{A}|$. We show that any randomized algorithm needs a…

Computational Complexity · Computer Science 2017-05-24 Yichen Chen, Mengdi Wang

We consider the reinforcement learning problem for the constrained Markov decision process (CMDP), which plays a central role in satisfying safety or resource constraints in sequential learning and decision-making. In this problem, we are…

Machine Learning · Computer Science 2025-11-19 Jiashuo Jiang, Yinyu Ye

We study the problem of synthesizing a policy that maximizes the entropy of a Markov decision process (MDP) subject to a temporal logic constraint. Such a policy minimizes the predictability of the paths it generates, or dually, maximizes…

Optimization and Control · Mathematics 2019-06-17 Yagiz Savas, Melkior Ornik, Murat Cubuktepe, Mustafa O. Karabag, Ufuk Topcu

We study the policy testing problem in discounted Markov decision processes (MDPs) in the fixed-confidence setting under a generative model with static sampling. The goal is to decide whether the value of a given policy exceeds a specified…

Machine Learning · Statistics 2026-04-21 Kaito Ariu, Po-An Wang, Alexandre Proutiere, Kenshi Abe