Related papers: Representation Policy Iteration

200 papers

Recently discovered polyhedral structures of the value function for finite state-action discounted Markov decision processes (MDP) shed light on understanding the success of reinforcement learning. We investigate the value function polytope…

Machine Learning · Computer Science 2022-06-27 Yue Wu, Jesús A. De Loera

Markov decision processes (MDPs) are a well studied framework for solving sequential decision making problems under uncertainty. Exact methods for solving MDPs based on dynamic programming such as policy iteration and value iteration are…

Artificial Intelligence · Computer Science 2015-09-09 Yanping Huang

In this work, we study value function approximation in reinforcement learning (RL) problems with high dimensional state or action spaces via a generalized version of representation policy iteration (RPI). We consider the limitations of…

Machine Learning · Computer Science 2019-01-18 Sephora Madjiheurem, Laura Toni

We provide performance guarantees for a variant of simulation-based policy iteration for controlling Markov decision processes that involves the use of stochastic approximation algorithms along with state-of-the-art techniques that are…

Machine Learning · Computer Science 2022-10-17 Anna Winnicki, R. Srikant

Relational Markov Decision Processes are a useful abstraction for complex reinforcement learning problems and stochastic planning problems. Recent work developed representation schemes and algorithms for planning in such problems using the…

Artificial Intelligence · Computer Science 2012-06-26 Chenggang Wang, Roni Khardon

This paper addresses the problem of model-free reinforcement learning for Robust Markov Decision Process (RMDP) with large state spaces. The goal of the RMDP framework is to find a policy that is robust against the parameter uncertainties…

Machine Learning · Computer Science 2021-02-15 Kishan Panaganti, Dileep Kalathil

Model-free reinforcement learning algorithms combined with value function approximation have recently achieved impressive performance in a variety of application domains. However, the theoretical understanding of such algorithms is limited,…

Machine Learning · Computer Science 2021-02-12 Botao Hao, Nevena Lazic, Yasin Abbasi-Yadkori, Pooria Joulani, Csaba Szepesvari

Many large MDPs can be represented compactly using a dynamic Bayesian network. Although the structure of the value function does not retain the structure of the process, recent work has shown that value functions in factored MDPs can often…

Artificial Intelligence · Computer Science 2013-01-18 Daphne Koller, Ron Parr

We build on a recently introduced geometric interpretation of Markov Decision Processes (MDPs) to analyze classical MDP-solving algorithms: Value Iteration (VI) and Policy Iteration (PI). First, we develop a geometry-based analytical…

Machine Learning · Computer Science 2025-03-07 Arsenii Mustafin, Aleksei Pakharev, Alex Olshevsky, Ioannis Ch. Paschalidis

We study the problem of learning optimal policies in finite-horizon Markov Decision Processes (MDPs) using low-rank reinforcement learning (RL) methods. In finite-horizon MDPs, the policies, and therefore the value functions (VFs) are not…

Machine Learning · Computer Science 2026-05-14 Sergio Rozada, Jose Luis Orejuela, Antonio G. Marques

Given a Markov decision process (MDP), we seek to learn representations for a range of policies to facilitate behavior steering at test time. As policies of an MDP are uniquely determined by their occupancy measures, we propose modeling…

Machine Learning · Computer Science 2026-02-02 Beiming Li, Sergio Rozada, Alejandro Ribeiro

We consider large-scale Markov decision processes (MDPs) with an unknown cost function and employ stochastic convex optimization tools to address the problem of imitation learning, which consists of learning a policy from a finite set of…

Machine Learning · Computer Science 2022-01-04 Angeliki Kamoutsi, Goran Banjac, John Lygeros

We consider approximate dynamic programming in $\gamma$-discounted Markov decision processes and apply it to approximate planning with linear value-function approximation. Our first contribution is a new variant of Approximate Policy…

Machine Learning · Computer Science 2022-10-31 Gellért Weisz, András György, Tadashi Kozuno, Csaba Szepesvári

Decision-making problems in uncertain or stochastic domains are often formulated as Markov decision processes (MDPs). Policy iteration (PI) is a popular algorithm for searching over policy-space, the size of which is exponential in the…

Artificial Intelligence · Computer Science 2013-01-30 Yishay Mansour, Satinder Singh

Reinforcement Learning (RL) encompasses diverse paradigms, including model-based RL, policy-based RL, and value-based RL, each tailored to approximate the model, optimal policy, and optimal value function, respectively. This work…

Machine Learning · Computer Science 2024-12-10 Guhao Feng, Han Zhong

Representations are at the core of all deep reinforcement learning (RL) methods for both Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs). Many representation learning methods and theoretical…

Machine Learning · Computer Science 2024-04-23 Tianwei Ni, Benjamin Eysenbach, Erfan Seyedsalehi, Michel Ma, Clement Gehring, Aditya Mahajan, Pierre-Luc Bacon

In this paper we discuss policy iteration methods for approximate solution of a finite-state discounted Markov decision problem, with a focus on feature-based aggregation methods and their connection with deep reinforcement learning…

Machine Learning · Computer Science 2018-08-23 Dimitri P. Bertsekas

We study an approach to policy selection for large relational Markov Decision Processes (MDPs). We consider a variant of approximate policy iteration (API) that replaces the usual value-function learning step with a learning step in policy…

Artificial Intelligence · Computer Science 2011-09-13 A. Fern, R. Givan, S. Yoon

Policy iteration (PI) is a recursive process of policy evaluation and improvement for solving an optimal decision-making/control problem, or in other words, a reinforcement learning (RL) problem. PI has also served as the fundamental for…

Artificial Intelligence · Computer Science 2021-04-06 Jaeyoung Lee, Richard S. Sutton

The ability to learn good representations of states is essential for solving large reinforcement learning problems, where exploration, generalization, and transfer are particularly challenging. The Laplacian representation is a promising…

Machine Learning · Computer Science 2024-04-04 Diego Gomez, Michael Bowling, Marlos C. Machado