Related papers: Bellman Error Centering

200 papers

Fast-converging algorithms are a contemporary requirement in reinforcement learning. In the context of linear function approximation, the magnitude of the smallest eigenvalue of the key matrix is a major factor reflecting the convergence…

Machine Learning · Computer Science 2024-11-12 Xingguo Chen, Yu Gong, Shangdong Yang, Wenhao Wang

Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance. This was…

Machine Learning · Statistics 2018-02-23 Mark Rowland, Marc G. Bellemare, Will Dabney, Rémi Munos, Yee Whye Teh

Tail-end risk measures such as static conditional value-at-risk (CVaR) are used in safety-critical applications to prevent rare, yet catastrophic events. Unlike risk-neutral objectives, the static CVaR of the return depends on entire…

Machine Learning · Computer Science 2026-02-04 Aneri Muni, Vincent Taboga, Esther Derman, Pierre-Luc Bacon, Erick Delage

In this paper we argue for the fundamental importance of the value distribution: the distribution of the random return received by a reinforcement learning agent. This is in contrast to the common approach to reinforcement learning which…

Machine Learning · Computer Science 2017-07-24 Marc G. Bellemare, Will Dabney, Rémi Munos

Reward shaping has been applied widely to accelerate Reinforcement Learning (RL) agents' training. However, a principled way of designing effective reward shaping functions, especially for complex continuous control problems, remains…

Machine Learning · Computer Science 2026-02-12 Mateo Juliani, Mingxuan Li, Elias Bareinboim

We show that discounted methods for solving continuing reinforcement learning problems can perform significantly better if they center their rewards by subtracting out the rewards' empirical average. The improvement is substantial at…

Machine Learning · Computer Science 2024-10-31 Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutton

Reinforcement learning (RL) algorithms assume that users specify tasks by manually writing down a reward function. However, this process can be laborious and demands considerable technical expertise. Can we devise RL algorithms that instead…

Machine Learning · Computer Science 2022-01-03 Benjamin Eysenbach, Sergey Levine, Ruslan Salakhutdinov

Inference-time scaling has recently emerged as a powerful paradigm for improving the reasoning capability of large language models. Among various approaches, Sequential Monte Carlo (SMC) has become a particularly important framework,…

Computation and Language · Computer Science 2026-02-03 Youheng Zhu, Yiping Lu

Most value function learning algorithms in reinforcement learning are based on the mean squared (projected) Bellman error. However, squared errors are known to be sensitive to outliers, both skewing the solution of the objective and…

Machine Learning · Computer Science 2023-04-19 Andrew Patterson, Victor Liao, Martha White

While reinforcement learning algorithms provide automated acquisition of optimal policies, practical application of such methods requires a number of design decisions, such as manually designing reward functions that not only define the…

Machine Learning · Computer Science 2022-12-29 Tim G. J. Rudner, Vitchyr H. Pong, Rowan McAllister, Yarin Gal, Sergey Levine

This work proposes an efficient batch algorithm for feature selection in reinforcement learning (RL) with theoretical convergence guarantees. To mitigate the estimation bias inherent in conventional regularization schemes, the first…

Machine Learning · Computer Science 2025-09-22 Kyohei Suzuki, Konstantinos Slavakis

Quantifying uncertainty about a policy's long-term performance is important to solve sequential decision-making tasks. We study the problem from a model-based Bayesian reinforcement learning perspective, where the goal is to learn the…

Machine Learning · Computer Science 2024-09-04 Carlos E. Luis, Alessandro G. Bottero, Julia Vinogradska, Felix Berkenkamp, Jan Peters

When faced with a novel scenario, it can be hard to succeed on the first attempt. In these challenging situations, it is important to know how to retry quickly and meaningfully. Retrying behavior can emerge naturally in robots trained on…

Robotics · Computer Science 2024-06-25 Maximilian Du, Alexander Khazatsky, Tobias Gerstenberg, Chelsea Finn

We present a distributional approach to theoretical analyses of reinforcement learning algorithms for constant step-sizes. We demonstrate its effectiveness by presenting simple and unified proofs of convergence for a variety of…

Machine Learning · Computer Science 2020-03-30 Philip Amortila, Doina Precup, Prakash Panangaden, Marc G. Bellemare

We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning. In particular, we focus on characterizing the variance over values induced by a distribution over MDPs. Previous work…

Machine Learning · Computer Science 2023-03-08 Carlos E. Luis, Alessandro G. Bottero, Julia Vinogradska, Felix Berkenkamp, Jan Peters

In order to solve a task using reinforcement learning, it is necessary to first formalise the goal of that task as a reward function. However, for many real-world tasks, it is very difficult to manually specify a reward function that never…

Machine Learning · Computer Science 2024-12-13 Joar Skalse, Lucy Farnik, Sumeet Ramesh Motwani, Erik Jenner, Adam Gleave, Alessandro Abate

Reliable long-horizon value prediction is difficult in offline reinforcement learning because fitted value methods combine bootstrapping, function approximation, and distribution shift, while standard guarantees often require Bellman…

Machine Learning · Statistics 2026-05-11 Lars van der Laan, Nathan Kallus

This paper investigates the so-called reward-balancing methods, a novel class of algorithms for solving discounted-return reinforcement learning (RL) problems. These methods consist of iteratively adjusting the reward function to transform…

Optimization and Control · Mathematics 2026-04-23 Simone Baroncini, Bahman Gharesifard, Giuseppe Notarstefano

Our goal is for AI systems to correctly identify and act according to their human user's objectives. Cooperative Inverse Reinforcement Learning (CIRL) formalizes this value alignment problem as a two-player game between a human and robot,…

Artificial Intelligence · Computer Science 2018-06-12 Dhruv Malik, Malayandi Palaniappan, Jaime F. Fisac, Dylan Hadfield-Menell, Stuart Russell, Anca D. Dragan

We propose a novel algorithmic framework for distributional reinforcement learning, based on learning finite-dimensional mean embeddings of return distributions. We derive several new algorithms for dynamic programming and…
