Related papers: Efficient Planning in Large MDPs with Weak Linear …

Large-Scale Markov Decision Problems via the Linear Programming Dual

We consider the problem of controlling a fully specified Markov decision process (MDP), also known as the planning problem, when the state space is very large and calculating the optimal policy is intractable. Instead, we pursue the more…

Optimization and Control · Mathematics 2019-01-09 Yasin Abbasi-Yadkori , Peter L. Bartlett , Xi Chen , Alan Malek

Efficient Solution Algorithms for Factored MDPs

This paper addresses the problem of planning under uncertainty in large Markov Decision Processes (MDPs). Factored MDPs represent a complex state space using state variables and the transition model using a dynamic Bayesian network. This…

Artificial Intelligence · Computer Science 2011-06-10 C. Guestrin , D. Koller , R. Parr , S. Venkataraman

Partitioned Linear Programming Approximations for MDPs

Approximate linear programming (ALP) is an efficient approach to solving large factored Markov decision processes (MDPs). The main idea of the method is to approximate the optimal value function by a set of basis functions and optimize…

Artificial Intelligence · Computer Science 2012-06-18 Branislav Kveton , Milos Hauskrecht

Linear Programming for Large-Scale Markov Decision Problems

We consider the problem of controlling a Markov decision process (MDP) with a large state space, so as to minimize average cost. Since it is intractable to compete with the optimal policy for large scale problems, we pursue the more modest…

Optimization and Control · Mathematics 2014-02-28 Yasin Abbasi-Yadkori , Peter L. Bartlett , Alan Malek

Sample and Oracle Efficient Reinforcement Learning for MDPs with Linearly-Realizable Value Functions

Designing sample-efficient and computationally feasible reinforcement learning (RL) algorithms is particularly challenging in environments with large or infinite state and action spaces. In this paper, we advance this effort by presenting…

Machine Learning · Computer Science 2024-10-04 Zakaria Mhammedi

Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes

Long-run average optimization problems for Markov decision processes (MDPs) require constructing policies with optimal steady-state behavior, i.e., optimal limit frequency of visits to the states. However, such policies may suffer from…

Multiagent Systems · Computer Science 2023-12-20 David Klaška , Antonín Kučera , Vojtěch Kůr , Vít Musil , Vojtěch Řehák

Approximate Linear Programming for Decentralized Policy Iteration in Cooperative Multi-agent Markov Decision Processes

In this work, we consider a cooperative multi-agent Markov decision process (MDP) involving m agents. At each decision epoch, all the m agents independently select actions in order to maximize a common long-term objective. In the policy…

Machine Learning · Computer Science 2024-05-01 Lakshmi Mandal , Chandrashekar Lakshminarayanan , Shalabh Bhatnagar

Asymptotic Optimality of Finite Approximations to Markov Decision Processes with Borel Spaces

Calculating optimal policies is known to be computationally difficult for Markov decision processes (MDPs) with Borel state and action spaces. This paper studies finite-state approximations of discrete time Markov decision processes with…

Optimization and Control · Mathematics 2016-09-23 Naci Saldi , Serdar Yüksel , Tamás Linder

Approximate dynamic programming with $(\min,+)$ linear function approximation for Markov decision processes

Markov Decision Processes (MDP) is an useful framework to cast optimal sequential decision making problems. Given any MDP the aim is to find the optimal action selection mechanism i.e., the optimal policy. Typically, the optimal policy…

Systems and Control · Computer Science 2014-03-18 Chandrashekar Lakshminarayanan , Shalabh Bhatnagar

Value-Function Approximations for Partially Observable Markov Decision Processes

Partially observable Markov decision processes (POMDPs) provide an elegant mathematical framework for modeling complex decision and planning problems in stochastic domains in which states of the system are observable only indirectly, via a…

Artificial Intelligence · Computer Science 2011-06-02 M. Hauskrecht

On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function

We consider local planning in fixed-horizon MDPs with a generative model under the assumption that the optimal value function lies close to the span of a feature map. The generative model provides a local access to the MDP: The planner can…

Machine Learning · Computer Science 2021-07-12 Gellért Weisz , Philip Amortila , Barnabás Janzer , Yasin Abbasi-Yadkori , Nan Jiang , Csaba Szepesvári

Exploring and Learning in Sparse Linear MDPs without Computationally Intractable Oracles

The key assumption underlying linear Markov Decision Processes (MDPs) is that the learner has access to a known feature map $\phi(x, a)$ that maps state-action pairs to $d$-dimensional vectors, and that the rewards and transitions are…

Machine Learning · Computer Science 2023-09-20 Noah Golowich , Ankur Moitra , Dhruv Rohatgi

A Model Approximation Scheme for Planning in Partially Observable Stochastic Domains

Partially observable Markov decision processes (POMDPs) are a natural model for planning problems where effects of actions are nondeterministic and the state of the world is not completely observable. It is difficult to solve POMDPs…

Artificial Intelligence · Computer Science 2009-09-25 N. L. Zhang , W. Liu

Of Cores: A Partial-Exploration Framework for Markov Decision Processes

We introduce a framework for approximate analysis of Markov decision processes (MDP) with bounded-, unbounded-, and infinite-horizon properties. The main idea is to identify a "core" of an MDP, i.e., a subsystem where we provably remain…

Systems and Control · Electrical Eng. & Systems 2023-06-22 Jan Křetínský , Tobias Meggendorfer

Approximate Value Iteration for Risk-aware Markov Decision Processes

We consider large-scale Markov decision processes (MDPs) with a risk measure of variability in cost, under the risk-aware MDPs paradigm. Previous studies showed that risk-aware MDPs, based on a minimax approach to handling risk, can be…

Systems and Control · Computer Science 2017-05-17 Pengqian Yu , William B. Haskell , Huan Xu

On the Complexity of Solving Markov Decision Problems

Markov decision problems (MDPs) provide the foundations for a number of problems of interest to AI researchers studying automated planning and reinforcement learning. In this paper, we summarize results regarding the complexity of solving…

Artificial Intelligence · Computer Science 2013-02-21 Michael L. Littman , Thomas L. Dean , Leslie Pack Kaelbling

Finite-Horizon Markov Decision Processes with State Constraints

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (minimize…

Optimization and Control · Mathematics 2015-07-08 Mahmoud El Chamie , Behcet Acikmese

Solving Factored MDPs with Hybrid State and Action Variables

Efficient representations and solutions for large decision problems with continuous and discrete variables are among the most important challenges faced by the designers of automated decision support systems. In this paper, we describe a…

Artificial Intelligence · Computer Science 2011-10-04 C. Guestrin , M. Hauskrecht , B. Kveton

Max-Plus Matching Pursuit for Deterministic Markov Decision Processes

We consider deterministic Markov decision processes (MDPs) and apply max-plus algebra tools to approximate the value iteration algorithm by a smaller-dimensional iteration based on a representation on dictionaries of value functions. The…

Machine Learning · Computer Science 2019-06-21 Francis Bach

Reinforcement Learning of Risk-Constrained Policies in Markov Decision Processes

Markov decision processes (MDPs) are the defacto frame-work for sequential decision making in the presence ofstochastic uncertainty. A classical optimization criterion forMDPs is to maximize the expected discounted-sum pay-off, which…

Artificial Intelligence · Computer Science 2020-02-28 Tomas Brazdil , Krishnendu Chatterjee , Petr Novotny , Jiri Vahala