Related papers: Topological Value Iteration Algorithms

An Accelerated Fitted Value Iteration Algorithm for MDPs with Finite and Vector-Valued Action Space

This paper studies an accelerated fitted value iteration (FVI) algorithm to solve high-dimensional Markov decision processes (MDPs). FVI is an approximate dynamic programming algorithm that has desirable theoretical properties. However, it…

Optimization and Control · Mathematics 2020-11-30 Sixiang Zhao, William B. Haskell, Michel-Alexandre Cardin

A First-Order Approach To Accelerated Value Iteration

Markov decision processes (MDPs) are used to model stochastic systems in many applications. Several efficient algorithms to compute optimal policies have been studied in the literature, including value iteration (VI) and policy iteration.…

Optimization and Control · Mathematics 2021-08-30 Vineet Goyal, Julien Grand-Clement

An Adaptive State Aggregation Algorithm for Markov Decision Processes

Value iteration is a well-known method of solving Markov Decision Processes (MDPs) that is simple to implement and boasts strong theoretical convergence guarantees. However, the computational cost of value iteration quickly becomes…

Machine Learning · Computer Science 2021-07-26 Guanting Chen, Johann Demetrio Gaebler, Matt Peng, Chunlin Sun, Yinyu Ye

On the Complexity of Value Iteration

Value iteration is a fundamental algorithm for solving Markov Decision Processes (MDPs). It computes the maximal $n$-step payoff by iterating $n$ times a recurrence equation which is naturally associated to the MDP. At the same time, value…

Formal Languages and Automata Theory · Computer Science 2019-04-30 Nikhil Balaji, Stefan Kiefer, Petr Novotný, Guillermo A. Pérez, Mahsa Shirmohammadi

Factored Value Iteration Converges

In this paper we propose a novel algorithm, factored value iteration (FVI), for the approximate solution of factored Markov decision processes (fMDPs). The traditional approximate value iteration algorithm is modified in two ways. For one,…

Artificial Intelligence · Computer Science 2008-08-13 Istvan Szita, Andras Lorincz

Geometric Re-Analysis of Classical MDP Solving Algorithms

We build on a recently introduced geometric interpretation of Markov Decision Processes (MDPs) to analyze classical MDP-solving algorithms: Value Iteration (VI) and Policy Iteration (PI). First, we develop a geometry-based analytical…

Machine Learning · Computer Science 2025-03-07 Arsenii Mustafin, Aleksei Pakharev, Alex Olshevsky, Ioannis Ch. Paschalidis

Value Iteration for Long-run Average Reward in Markov Decision Processes

Markov decision processes (MDPs) are standard models for probabilistic systems with non-deterministic behaviours. Long-run average rewards provide a mathematically elegant formalism for expressing long term performance. Value iteration (VI)…

Systems and Control · Computer Science 2017-09-01 Pranav Ashok, Krishnendu Chatterjee, Przemyslaw Daca, Jan Křetínský, Tobias Meggendorfer

Geometric Policy Iteration for Markov Decision Processes

Recently discovered polyhedral structures of the value function for finite state-action discounted Markov decision processes (MDP) shed light on understanding the success of reinforcement learning. We investigate the value function polytope…

Machine Learning · Computer Science 2022-06-27 Yue Wu, Jesús A. De Loera

Value Iteration with Options and State Aggregation

This paper presents a way of solving Markov Decision Processes that combines state abstraction and temporal abstraction. Specifically, we combine state aggregation with the options framework and demonstrate that they work well together and…

Artificial Intelligence · Computer Science 2015-01-19 Kamil Ciosek, David Silver

Deflated Dynamics Value Iteration

The Value Iteration (VI) algorithm is an iterative procedure to compute the value function of a Markov decision process, and is the basis of many reinforcement learning (RL) algorithms as well. As the error convergence rate of VI as a…

Machine Learning · Computer Science 2025-06-12 Jongmin Lee, Amin Rakhsha, Ernest K. Ryu, Amir-massoud Farahmand

Polynomial Value Iteration Algorithms for Deterministic MDPs

Value iteration is a commonly used and empirically competitive method in solving many Markov decision process problems. However, it is known that value iteration has only pseudo-polynomial complexity in general. We establish a somewhat…

Artificial Intelligence · Computer Science 2013-01-07 Omid Madani

Policy Iteration for Relational MDPs

Relational Markov Decision Processes are a useful abstraction for complex reinforcement learning problems and stochastic planning problems. Recent work developed representation schemes and algorithms for planning in such problems using the…

Artificial Intelligence · Computer Science 2012-06-26 Chenggang Wang, Roni Khardon

Sound Value Iteration for Simple Stochastic Games

Algorithmic analysis of Markov decision processes (MDP) and stochastic games (SG) in practice relies on value-iteration (VI) algorithms. Since basic VI does not provide guarantees on the precision of the result, variants of VI have been…

Computer Science and Game Theory · Computer Science 2025-09-18 Muqsit Azeem, Jan Kretinsky, Maximilian Weininger

Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes

Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding…

Artificial Intelligence · Computer Science 2011-06-02 N. L. Zhang, W. Zhang

Sound Value Iteration for Simple Stochastic Games

Algorithmic analysis of Markov decision processes (MDP) and stochastic games (SG) in practice relies on value-iteration (VI) algorithms. Since the basic version of VI does not provide guarantees on the precision of the result, variants of…

Computer Science and Game Theory · Computer Science 2026-03-31 Muqsit Azeem, Jan Kretinsky, Maximilian Weininger

Value Iteration with Guessing for Markov Chains and Markov Decision Processes

Two standard models for probabilistic systems are Markov chains (MCs) and Markov decision processes (MDPs). Classic objectives for such probabilistic models for control and planning problems are reachability and stochastic shortest path.…

Artificial Intelligence · Computer Science 2025-05-13 Krishnendu Chatterjee, Mahdi JafariRaviz, Raimundo Saona, Jakub Svoboda

Efficient Strategy Iteration for Mean Payoff in Markov Decision Processes

Markov decision processes (MDPs) are standard models for probabilistic systems with non-deterministic behaviours. Mean payoff (or long-run average reward) provides a mathematically elegant formalism to express performance related…

Performance · Computer Science 2017-09-08 Jan Křetínský, Tobias Meggendorfer

Generalized Second Order Value Iteration in Markov Decision Processes

Value iteration is a fixed point iteration technique utilized to obtain the optimal value function and policy in a discounted reward Markov Decision Process (MDP). Here, a contraction operator is constructed and applied repeatedly to arrive…

Machine Learning · Computer Science 2021-09-21 Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar

Analysis of Value Iteration Through Absolute Probability Sequences

Value Iteration is a widely used algorithm for solving Markov Decision Processes (MDPs). While previous studies have extensively analyzed its convergence properties, they primarily focus on convergence with respect to the infinity norm. In…

Machine Learning · Computer Science 2025-02-06 Arsenii Mustafin, Sebastien Colla, Alex Olshevsky, Ioannis Ch. Paschalidis

Quantum Algorithms for Finite-horizon Markov Decision Processes

In this work, we design quantum algorithms that are more efficient than classical algorithms to solve time-dependent and finite-horizon Markov Decision Processes (MDPs) in two distinct settings: (1) In the exact dynamics setting, where the…

Quantum Physics · Physics 2025-08-11 Bin Luo, Yuwen Huang, Jonathan Allcock, Xiaojun Lin, Shengyu Zhang, John C. S. Lui