Related papers: Recurrent Value Functions

Value function estimation using conditional diffusion models for control

A fairly reliable trend in deep reinforcement learning is that the performance scales with the number of parameters, provided a complementary scaling in amount of training data. As the appetite for large models increases, it is imperative…

Machine Learning · Computer Science 2023-06-14 Bogdan Mazoure, Walter Talbott, Miguel Angel Bautista, Devon Hjelm, Alexander Toshev, Josh Susskind

Value-Based Reinforcement Learning for Continuous Control Robotic Manipulation in Multi-Task Sparse Reward Settings

Learning continuous control in high-dimensional sparse reward settings, such as robotic manipulation, is a challenging problem due to the number of samples often required to obtain accurate optimal value and policy estimates. While many…

Robotics · Computer Science 2021-07-29 Sreehari Rammohan, Shangqun Yu, Bowen He, Eric Hsiung, Eric Rosen, Stefanie Tellex, George Konidaris

On the continuity and smoothness of the value function in reinforcement learning and optimal control

The value function plays a crucial role as a measure for the cumulative future reward an agent receives in both reinforcement learning and optimal control. It is therefore of interest to study how similar the values of neighboring states…

Systems and Control · Electrical Eng. & Systems 2024-03-22 Hans Harder, Sebastian Peitz

Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization

Reinforcement learning (RL) is a powerful machine learning technique that enables an intelligent agent to learn an optimal policy that maximizes the cumulative rewards in sequential decision making. Most methods in the existing…

Machine Learning · Statistics 2023-01-06 Chengchun Shi, Zhengling Qi, Jianing Wang, Fan Zhou

Deep Exploration via Randomized Value Functions

We study the use of randomized value functions to guide deep exploration in reinforcement learning. This offers an elegant means for synthesizing statistically and computationally efficient exploration with common practical approaches to…

Machine Learning · Statistics 2019-09-25 Ian Osband, Benjamin Van Roy, Daniel Russo, Zheng Wen

Parameter-Based Value Functions

Traditional off-policy actor-critic Reinforcement Learning (RL) algorithms learn value functions of a single target policy. However, when value functions are updated to track the learned policy, they forget potentially useful information…

Machine Learning · Computer Science 2021-08-16 Francesco Faccio, Louis Kirsch, Jürgen Schmidhuber

Prediction and Control in Continual Reinforcement Learning

Temporal difference (TD) learning is often used to update the estimate of the value function which is used by RL agents to extract useful policies. In this paper, we focus on value function estimation in continual reinforcement learning. We…

Machine Learning · Computer Science 2023-12-20 Nishanth Anand, Doina Precup

Locally Constrained Representations in Reinforcement Learning

The success of Reinforcement Learning (RL) heavily relies on the ability to learn robust representations from the observations of the environment. In most cases, the representations learned purely by the reinforcement learning loss can…

Machine Learning · Computer Science 2024-02-12 Somjit Nath, Rushiv Arora, Samira Ebrahimi Kahou

Deep Radial-Basis Value Functions for Continuous Control

A core operation in reinforcement learning (RL) is finding an action that is optimal with respect to a learned value function. This operation is often challenging when the learned value function takes continuous actions as input. We…

Machine Learning · Computer Science 2021-03-16 Kavosh Asadi, Neev Parikh, Ronald E. Parr, George D. Konidaris, Michael L. Littman

Reinforcement Learning by Value Gradients

The concept of the value-gradient is introduced and developed in the context of reinforcement learning. It is shown that by learning the value-gradients exploration or stochastic behaviour is no longer needed to find locally optimal…

Neural and Evolutionary Computing · Computer Science 2008-03-26 Michael Fairbank

Experience Replay Using Transition Sequences

Experience replay is one of the most commonly used approaches to improve the sample efficiency of reinforcement learning algorithms. In this work, we propose an approach to select and replay sequences of transitions in order to accelerate…

Artificial Intelligence · Computer Science 2022-09-29 Thommen George Karimpanal, Roland Bouffanais

Learning State Representations from Random Deep Action-conditional Predictions

Our main contribution in this work is an empirical finding that random General Value Functions (GVFs), i.e., deep action-conditional predictions -- random both in what feature of observations they predict as well as in the sequence of…

Machine Learning · Computer Science 2021-11-09 Zeyu Zheng, Vivek Veeriah, Risto Vuorio, Richard Lewis, Satinder Singh

Symbolic Regression Methods for Reinforcement Learning

Reinforcement learning algorithms can solve dynamic decision-making and optimal control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value…

Machine Learning · Computer Science 2021-11-16 Jiří Kubalík, Erik Derner, Jan Žegklitz, Robert Babuška

A Geometric Perspective on Optimal Representations for Reinforcement Learning

We propose a new perspective on representation learning in reinforcement learning based on geometric properties of the space of value functions. We leverage this perspective to provide formal evidence regarding the usefulness of value…

Machine Learning · Computer Science 2019-06-27 Marc G. Bellemare, Will Dabney, Robert Dadashi, Adrien Ali Taiga, Pablo Samuel Castro, Nicolas Le Roux, Dale Schuurmans, Tor Lattimore, Clare Lyle

General Value Function Networks

State construction is important for learning in partially observable environments. A general purpose strategy for state construction is to learn the state update using a Recurrent Neural Network (RNN), which updates the internal state using…

Machine Learning · Computer Science 2021-02-03 Matthew Schlegel, Andrew Jacobsen, Zaheer Abbas, Andrew Patterson, Adam White, Martha White

Representation Learning on Graphs: A Reinforcement Learning Application

In this work, we study value function approximation in reinforcement learning (RL) problems with high dimensional state or action spaces via a generalized version of representation policy iteration (RPI). We consider the limitations of…

Machine Learning · Computer Science 2019-01-18 Sephora Madjiheurem, Laura Toni

Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment

Policy gradient methods are appealing in deep reinforcement learning but suffer from high variance of gradient estimate. To reduce the variance, the state value function is applied commonly. However, the effect of the state value function…

Machine Learning · Computer Science 2021-08-06 Jiaming Guo, Rui Zhang, Xishan Zhang, Shaohui Peng, Qi Yi, Zidong Du, Xing Hu, Qi Guo, Yunji Chen

Risk-sensitive Reinforcement Learning Based on Convex Scoring Functions

We propose a reinforcement learning (RL) framework under a broad class of risk objectives, characterized by convex scoring functions. This class covers many common risk measures, such as variance, Expected Shortfall, entropic Value-at-Risk,…

Mathematical Finance · Quantitative Finance 2025-05-16 Shanyu Han, Yang Liu, Xiang Yu

World Value Functions: Knowledge Representation for Multitask Reinforcement Learning

An open problem in artificial intelligence is how to learn and represent knowledge that is sufficient for a general agent that needs to solve multiple tasks in a given world. In this work we propose world value functions (WVFs), which are a…

Machine Learning · Computer Science 2022-05-19 Geraud Nangue Tasse, Steven James, Benjamin Rosman

Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning

Deep reinforcement learning (RL) algorithms suffer severe performance degradation when the interaction data is scarce, which limits their real-world application. Recently, visual representation learning has been shown to be effective and…

Machine Learning · Computer Science 2022-08-17 Yang Yue, Bingyi Kang, Zhongwen Xu, Gao Huang, Shuicheng Yan