Related papers: Recurrent Value Functions

200 papers

A fairly reliable trend in deep reinforcement learning is that the performance scales with the number of parameters, provided a complementary scaling in amount of training data. As the appetite for large models increases, it is imperative…

Machine Learning · Computer Science 2023-06-14 Bogdan Mazoure, Walter Talbott, Miguel Angel Bautista, Devon Hjelm, Alexander Toshev, Josh Susskind

Learning continuous control in high-dimensional sparse reward settings, such as robotic manipulation, is a challenging problem due to the number of samples often required to obtain accurate optimal value and policy estimates. While many…

Robotics · Computer Science 2021-07-29 Sreehari Rammohan, Shangqun Yu, Bowen He, Eric Hsiung, Eric Rosen, Stefanie Tellex, George Konidaris

The value function plays a crucial role as a measure for the cumulative future reward an agent receives in both reinforcement learning and optimal control. It is therefore of interest to study how similar the values of neighboring states…

Systems and Control · Electrical Eng. & Systems 2024-03-22 Hans Harder, Sebastian Peitz

Reinforcement learning (RL) is a powerful machine learning technique that enables an intelligent agent to learn an optimal policy that maximizes the cumulative rewards in sequential decision making. Most methods in the existing…

Machine Learning · Statistics 2023-01-06 Chengchun Shi, Zhengling Qi, Jianing Wang, Fan Zhou

We study the use of randomized value functions to guide deep exploration in reinforcement learning. This offers an elegant means for synthesizing statistically and computationally efficient exploration with common practical approaches to…

Machine Learning · Statistics 2019-09-25 Ian Osband, Benjamin Van Roy, Daniel Russo, Zheng Wen

Traditional off-policy actor-critic Reinforcement Learning (RL) algorithms learn value functions of a single target policy. However, when value functions are updated to track the learned policy, they forget potentially useful information…

Machine Learning · Computer Science 2021-08-16 Francesco Faccio, Louis Kirsch, Jürgen Schmidhuber

Temporal difference (TD) learning is often used to update the estimate of the value function which is used by RL agents to extract useful policies. In this paper, we focus on value function estimation in continual reinforcement learning. We…

Machine Learning · Computer Science 2023-12-20 Nishanth Anand, Doina Precup

The success of Reinforcement Learning (RL) heavily relies on the ability to learn robust representations from the observations of the environment. In most cases, the representations learned purely by the reinforcement learning loss can…

Machine Learning · Computer Science 2024-02-12 Somjit Nath, Rushiv Arora, Samira Ebrahimi Kahou

A core operation in reinforcement learning (RL) is finding an action that is optimal with respect to a learned value function. This operation is often challenging when the learned value function takes continuous actions as input. We…

Machine Learning · Computer Science 2021-03-16 Kavosh Asadi, Neev Parikh, Ronald E. Parr, George D. Konidaris, Michael L. Littman

The concept of the value-gradient is introduced and developed in the context of reinforcement learning. It is shown that by learning the value-gradients exploration or stochastic behaviour is no longer needed to find locally optimal…

Neural and Evolutionary Computing · Computer Science 2008-03-26 Michael Fairbank

Experience replay is one of the most commonly used approaches to improve the sample efficiency of reinforcement learning algorithms. In this work, we propose an approach to select and replay sequences of transitions in order to accelerate…

Artificial Intelligence · Computer Science 2022-09-29 Thommen George Karimpanal, Roland Bouffanais

Our main contribution in this work is an empirical finding that random General Value Functions (GVFs), i.e., deep action-conditional predictions -- random both in what feature of observations they predict as well as in the sequence of…

Machine Learning · Computer Science 2021-11-09 Zeyu Zheng, Vivek Veeriah, Risto Vuorio, Richard Lewis, Satinder Singh

Reinforcement learning algorithms can solve dynamic decision-making and optimal control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value…

Machine Learning · Computer Science 2021-11-16 Jiří Kubalík, Erik Derner, Jan Žegklitz, Robert Babuška

We propose a new perspective on representation learning in reinforcement learning based on geometric properties of the space of value functions. We leverage this perspective to provide formal evidence regarding the usefulness of value…

State construction is important for learning in partially observable environments. A general purpose strategy for state construction is to learn the state update using a Recurrent Neural Network (RNN), which updates the internal state using…

Machine Learning · Computer Science 2021-02-03 Matthew Schlegel, Andrew Jacobsen, Zaheer Abbas, Andrew Patterson, Adam White, Martha White

In this work, we study value function approximation in reinforcement learning (RL) problems with high dimensional state or action spaces via a generalized version of representation policy iteration (RPI). We consider the limitations of…

Machine Learning · Computer Science 2019-01-18 Sephora Madjiheurem, Laura Toni

Policy gradient methods are appealing in deep reinforcement learning but suffer from high variance in the gradient estimate. To reduce the variance, a state value function is commonly applied. However, the effect of the state value function…

Machine Learning · Computer Science 2021-08-06 Jiaming Guo, Rui Zhang, Xishan Zhang, Shaohui Peng, Qi Yi, Zidong Du, Xing Hu, Qi Guo, Yunji Chen

We propose a reinforcement learning (RL) framework under a broad class of risk objectives, characterized by convex scoring functions. This class covers many common risk measures, such as variance, Expected Shortfall, entropic Value-at-Risk,…

Mathematical Finance · Quantitative Finance 2025-05-16 Shanyu Han, Yang Liu, Xiang Yu

An open problem in artificial intelligence is how to learn and represent knowledge that is sufficient for a general agent that needs to solve multiple tasks in a given world. In this work we propose world value functions (WVFs), which are a…

Machine Learning · Computer Science 2022-05-19 Geraud Nangue Tasse, Steven James, Benjamin Rosman

Deep reinforcement learning (RL) algorithms suffer severe performance degradation when the interaction data is scarce, which limits their real-world application. Recently, visual representation learning has been shown to be effective and…

Machine Learning · Computer Science 2022-08-17 Yang Yue, Bingyi Kang, Zhongwen Xu, Gao Huang, Shuicheng Yan