Related papers: Analyzing Approximate Value Iteration Algorithms

200 papers

Approximate value iteration (AVI) is a family of algorithms for reinforcement learning (RL) that aims to obtain an approximation of the optimal value function. Generally, AVI algorithms implement an iterated procedure where each step…

Machine Learning · Computer Science 2024-03-07 Théo Vincent, Alberto Maria Metelli, Boris Belousov, Jan Peters, Marcello Restelli, Carlo D'Eramo

In this paper, we study the theoretical properties of the projected Bellman equation (PBE) and two algorithms to solve this equation: linear Q-learning and approximate value iteration (AVI). We consider two sufficient conditions for the…

Artificial Intelligence · Computer Science 2025-04-16 Han-Dong Lim, Donghwan Lee

Value iteration (VI) is a ubiquitous algorithm for optimal control, planning, and reinforcement learning schemes. Under the right assumptions, VI is a vital tool to generate inputs with desirable properties for the controlled system, like…

Optimization and Control · Mathematics 2020-11-23 Mathieu Granzotto, Romain Postoyan, Dragan Nešić, Lucian Buşoniu, Jamal Daafouz

This paper studies value iteration for infinite horizon contracting Markov decision processes under convexity assumptions and when the state space is uncountable. The original value iteration is replaced with a more tractable form and the…

Optimization and Control · Mathematics 2018-02-21 Jeremy Yee

Asynchronous stochastic approximations (SAs) are an important class of model-free algorithms, tools and techniques that are popular in multi-agent and distributed control scenarios. To counter Bellman's curse of dimensionality, such…

Optimization and Control · Mathematics 2019-05-03 Arunselvan Ramaswamy, Shalabh Bhatnagar, Daniel E. Quevedo

Two standard models for probabilistic systems are Markov chains (MCs) and Markov decision processes (MDPs). Classic objectives for such probabilistic models for control and planning problems are reachability and stochastic shortest path.…

Artificial Intelligence · Computer Science 2025-05-13 Krishnendu Chatterjee, Mahdi JafariRaviz, Raimundo Saona, Jakub Svoboda

This paper considers variational inequalities (VI) defined by the conditional value-at-risk (CVaR) of uncertain functions and provides three stochastic approximation schemes to solve them. All methods use an empirical estimate of the CVaR…

Optimization and Control · Mathematics 2022-11-16 Jasper Verbree, Ashish Cherukuri

Value Iteration (VI) is foundational to the theory and practice of modern reinforcement learning, and it is known to converge at a $\mathcal{O}(\gamma^k)$-rate, where $\gamma$ is the discount factor. Surprisingly, however, the optimal rate…

Machine Learning · Computer Science 2023-10-31 Jongmin Lee, Ernest K. Ryu

While there is an extensive body of research on the analysis of Value Iteration (VI) for discounted cumulative-reward MDPs, prior work on analyzing VI for (undiscounted) average-reward MDPs has been limited, and most prior results focus on…

Optimization and Control · Mathematics 2026-02-10 Jongmin Lee, Ernest K. Ryu

We propose universal randomized function approximation-based empirical value iteration (EVI) algorithms for Markov decision processes. The 'empirical' nature comes from each iteration being done empirically from samples available from…

Optimization and Control · Mathematics 2019-04-25 William B. Haskell, Rahul Jain, Hiteshi Sharma, Pengqian Yu

Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation,…

Artificial Intelligence · Computer Science 2010-06-15 Marek Petrik, Shlomo Zilberstein

The stochastic volatility inspired (SVI) model is widely used to fit the implied variance smile. Presently, most optimizer algorithms for the SVI model have a strong dependence on the input starting point. In this study, we develop an…

Mathematical Finance · Quantitative Finance 2023-01-20 Shuzhen Yang, Wenqing Zhang

Simple stochastic games can be solved by value iteration (VI), which yields a sequence of under-approximations of the value of the game. This sequence is guaranteed to converge to the value only in the limit. Since no stopping criterion is…

Logic in Computer Science · Computer Science 2021-02-02 Edon Kelmendi, Julia Krämer, Jan Kretinsky, Maximilian Weininger

Multi-time-scale stochastic approximation is an iterative algorithm for finding the fixed point of a set of $N$ coupled operators given their noisy samples. It has been observed that due to the coupling between the decision variables and…

Optimization and Control · Mathematics 2024-09-13 Sihan Zeng, Thinh T. Doan

Markov decision processes (MDPs) are used to model stochastic systems in many applications. Several efficient algorithms to compute optimal policies have been studied in the literature, including value iteration (VI) and policy iteration.…

Optimization and Control · Mathematics 2021-08-30 Vineet Goyal, Julien Grand-Clement

In this paper we propose a novel algorithm, factored value iteration (FVI), for the approximate solution of factored Markov decision processes (fMDPs). The traditional approximate value iteration algorithm is modified in two ways. For one,…

Artificial Intelligence · Computer Science 2008-08-13 Istvan Szita, Andras Lorincz

Approximate dynamic programming algorithms, such as approximate value iteration, have been successfully applied to many complex reinforcement learning tasks, and a better approximate dynamic programming algorithm is expected to further…

Machine Learning · Statistics 2017-10-31 Tadashi Kozuno, Eiji Uchibe, Kenji Doya

We propose empirical dynamic programming algorithms for Markov decision processes (MDPs). In these algorithms, the exact expectation in the Bellman operator in classical value iteration is replaced by an empirical estimate to get 'empirical…

Optimization and Control · Mathematics 2013-11-26 William B. Haskell, Rahul Jain, Dileep Kalathil

Adaptive optimal control using value iteration (VI) initiated from a stabilizing policy is theoretically analyzed in various aspects including the continuity of the result, the stability of the system operated using any single/constant…

Systems and Control · Computer Science 2015-05-18 Ali Heydari

We consider a framework for the construction of iterative schemes for operator equations that combine low-rank approximation in tensor formats and adaptive approximation in a basis. Under fairly general assumptions, we obtain a rigorous…

Numerical Analysis · Mathematics 2014-03-17 Markus Bachmayr, Wolfgang Dahmen