Related papers: An Adaptive State Aggregation Algorithm for Markov…

200 papers

This paper presents a way of solving Markov Decision Processes that combines state abstraction and temporal abstraction. Specifically, we combine state aggregation with the options framework and demonstrate that they work well together and…

Artificial Intelligence · Computer Science 2015-01-19 Kamil Ciosek, David Silver

In this paper we provide faster algorithms for approximately solving discounted Markov Decision Processes in multiple parameter regimes. Given a discounted Markov Decision Process (DMDP) with $|S|$ states, $|A|$ actions, discount factor…

Data Structures and Algorithms · Computer Science 2020-12-24 Aaron Sidford, Mengdi Wang, Xian Wu, Yinyu Ye

Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding…

Artificial Intelligence · Computer Science 2011-06-02 N. L. Zhang, W. Zhang

Markov decision processes (MDPs) are standard models for probabilistic systems with non-deterministic behaviours. Mean payoff (or long-run average reward) provides a mathematically elegant formalism to express performance related…

Performance · Computer Science 2017-09-08 Jan Křetínský, Tobias Meggendorfer

Value iteration is a commonly used and empirically competitive method in solving many Markov decision process problems. However, it is known that value iteration has only pseudo-polynomial complexity in general. We establish a somewhat…

Artificial Intelligence · Computer Science 2013-01-07 Omid Madani

Value iteration is a fundamental algorithm for solving Markov Decision Processes (MDPs). It computes the maximal $n$-step payoff by iterating $n$ times a recurrence equation which is naturally associated to the MDP. At the same time, value…

Formal Languages and Automata Theory · Computer Science 2019-04-30 Nikhil Balaji, Stefan Kiefer, Petr Novotný, Guillermo A. Pérez, Mahsa Shirmohammadi

Value Iteration is a widely used algorithm for solving Markov Decision Processes (MDPs). While previous studies have extensively analyzed its convergence properties, they primarily focus on convergence with respect to the infinity norm. In…

Machine Learning · Computer Science 2025-02-06 Arsenii Mustafin, Sebastien Colla, Alex Olshevsky, Ioannis Ch. Paschalidis

We build on a recently introduced geometric interpretation of Markov Decision Processes (MDPs) to analyze classical MDP-solving algorithms: Value Iteration (VI) and Policy Iteration (PI). First, we develop a geometry-based analytical…

Machine Learning · Computer Science 2025-03-07 Arsenii Mustafin, Aleksei Pakharev, Alex Olshevsky, Ioannis Ch. Paschalidis

We consider deterministic Markov decision processes (MDPs) and apply max-plus algebra tools to approximate the value iteration algorithm by a smaller-dimensional iteration based on a representation on dictionaries of value functions. The…

Machine Learning · Computer Science 2019-06-21 Francis Bach

In this paper we consider the problem of computing an $\epsilon$-optimal policy of a discounted Markov Decision Process (DMDP) provided we can only access its transition function through a generative sampling model that given any…

Optimization and Control · Mathematics 2019-06-07 Aaron Sidford, Mengdi Wang, Xian Wu, Lin F. Yang, Yinyu Ye

Markov decision processes (MDPs) are used to model stochastic systems in many applications. Several efficient algorithms to compute optimal policies have been studied in the literature, including value iteration (VI) and policy iteration.…

Optimization and Control · Mathematics 2021-08-30 Vineet Goyal, Julien Grand-Clement

We study the general approach to accelerating the convergence of the most widely used solution method of Markov decision processes with the total expected discounted reward. Inspired by the monotone behavior of the contraction mappings in…

Optimization and Control · Mathematics 2008-03-28 Oleksandr Shlakhter, Chi-Guhn Lee, Dmitry Khmelev, Nasser Jaber

We consider large-scale Markov decision processes (MDPs) with a risk measure of variability in cost, under the risk-aware MDPs paradigm. Previous studies showed that risk-aware MDPs, based on a minimax approach to handling risk, can be…

Systems and Control · Computer Science 2017-05-17 Pengqian Yu, William B. Haskell, Huan Xu

We present a method for solving implicit (factored) Markov decision processes (MDPs) with very large state spaces. We introduce a property of state space partitions which we call epsilon-homogeneity. Intuitively, an epsilon-homogeneous…

Artificial Intelligence · Computer Science 2013-02-08 Thomas L. Dean, Robert Givan, Sonia Leach

Value iteration is a fixed point iteration technique utilized to obtain the optimal value function and policy in a discounted reward Markov Decision Process (MDP). Here, a contraction operator is constructed and applied repeatedly to arrive…

Machine Learning · Computer Science 2021-09-21 Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar

In this paper, we consider the finite-state approximation of a discrete-time constrained Markov decision process (MDP) under the discounted and average cost criteria. Using the linear programming formulation of the constrained discounted…

Optimization and Control · Mathematics 2018-07-10 Naci Saldi

In this paper we propose a novel algorithm, factored value iteration (FVI), for the approximate solution of factored Markov decision processes (fMDPs). The traditional approximate value iteration algorithm is modified in two ways. For one,…

Artificial Intelligence · Computer Science 2008-08-13 Istvan Szita, Andras Lorincz

We consider a class of optimization problems over stochastic variables where the algorithm can learn information about the value of any variable through a series of costly steps; we model this information acquisition process as a Markov…

Data Structures and Algorithms · Computer Science 2025-07-25 Shuchi Chawla, Dimitris Christou, Amit Harlev, Ziv Scully

We present a technique for speeding up the convergence of value iteration for partially observable Markov decision processes (POMDPs). The underlying idea is similar to that behind modified policy iteration for fully observable Markov…

Artificial Intelligence · Computer Science 2013-01-30 Nevin Lianwen Zhang, Stephen S. Lee, Weihong Zhang

Recently discovered polyhedral structures of the value function for finite state-action discounted Markov decision processes (MDP) shed light on understanding the success of reinforcement learning. We investigate the value function polytope…

Machine Learning · Computer Science 2022-06-27 Yue Wu, Jesús A. De Loera