Related papers: Debiasing Meta-Gradient Reinforcement Learning by …

An Investigation of the Bias-Variance Tradeoff in Meta-Gradients

Meta-gradients provide a general approach for optimizing the meta-parameters of reinforcement learning (RL) algorithms. Estimation of meta-gradients is central to the performance of these meta-algorithms, and has been studied in the setting…

Machine Learning · Computer Science 2022-09-26 Risto Vuorio, Jacob Beck, Shimon Whiteson, Jakob Foerster, Gregory Farquhar

Meta-Gradient Reinforcement Learning with an Objective Discovered Online

Deep reinforcement learning includes a broad family of algorithms that parameterise an internal representation, such as a value function or policy, by a deep neural network. Each algorithm optimises its parameters with respect to an…

Machine Learning · Computer Science 2020-07-17 Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver

A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning

Gradient-based Meta-RL (GMRL) refers to methods that maintain two-level optimisation procedures wherein the outer-loop meta-learner guides the inner-loop gradient-based reinforcement learner to achieve fast adaptations. In this paper, we…

Machine Learning · Computer Science 2024-03-26 Xidong Feng, Bo Liu, Jie Ren, Luo Mai, Rui Zhu, Haifeng Zhang, Jun Wang, Yaodong Yang

Meta-Gradient Reinforcement Learning

The goal of reinforcement learning algorithms is to estimate and/or optimise the value function. However, unlike supervised learning, no teacher or oracle is available to provide the true value function. Instead, the majority of…

Machine Learning · Computer Science 2018-05-25 Zhongwen Xu, Hado van Hasselt, David Silver

Offline Meta-Reinforcement Learning with Advantage Weighting

This paper introduces the offline meta-reinforcement learning (offline meta-RL) problem setting and proposes an algorithm that performs well in this setting. Offline meta-RL is analogous to the widely successful supervised learning strategy…

Machine Learning · Computer Science 2021-07-22 Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn

Model Based Meta Learning of Critics for Policy Gradients

Being able to seamlessly generalize across different tasks is fundamental for robots to act in our world. However, learning representations that generalize quickly to new scenarios is still an open research problem in reinforcement…

Machine Learning · Computer Science 2022-04-06 Sarah Bechtle, Ludovic Righetti, Franziska Meier

Variational Meta Reinforcement Learning for Social Robotics

With the increasing presence of robots in our every-day environments, improving their social skills is of utmost importance. Nonetheless, social robotics still faces many challenges. One bottleneck is that robotic behaviors need to be often…

Robotics · Computer Science 2023-08-08 Anand Ballou, Xavier Alameda-Pineda, Chris Reinke

Biased Gradient Estimate with Drastic Variance Reduction for Meta Reinforcement Learning

Despite the empirical success of meta reinforcement learning (meta-RL), there are still a number of poorly-understood discrepancies between theory and practice. Critically, biased gradient estimates are almost always implemented in practice,…

Machine Learning · Computer Science 2021-12-15 Yunhao Tang

How Should We Meta-Learn Reinforcement Learning Algorithms?

The process of meta-learning algorithms from data, instead of relying on manual design, is growing in popularity as a paradigm for improving the performance of machine learning systems. Meta-learning shows particular promise for…

Machine Learning · Computer Science 2025-09-11 Alexander David Goldie, Zilin Wang, Jaron Cohen, Jakob Nicolaus Foerster, Shimon Whiteson

Offline Meta-Reinforcement Learning with Online Self-Supervision

Meta-reinforcement learning (RL) methods can meta-train policies that adapt to new tasks with orders of magnitude less data than standard RL, but meta-training itself is costly and time-consuming. If we can meta-train on offline data, then…

Machine Learning · Computer Science 2022-07-08 Vitchyr H. Pong, Ashvin Nair, Laura Smith, Catherine Huang, Sergey Levine

Regularizing Meta-Learning via Gradient Dropout

With the growing attention on learning-to-learn new tasks using only a few examples, meta-learning has been widely used in numerous problems such as few-shot classification, reinforcement learning, and domain generalization. However,…

Computer Vision and Pattern Recognition · Computer Science 2020-04-14 Hung-Yu Tseng, Yi-Wen Chen, Yi-Hsuan Tsai, Sifei Liu, Yen-Yu Lin, Ming-Hsuan Yang

On the Role of Discount Factor in Offline Reinforcement Learning

Offline reinforcement learning (RL) enables effective learning from previously collected data without exploration, which shows great promise in real-world applications when exploration is expensive or even infeasible. The discount factor,…

Machine Learning · Computer Science 2022-06-16 Hao Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang

Learning Rewards, Not Labels: Adversarial Inverse Reinforcement Learning for Machinery Fault Detection

Reinforcement learning (RL) offers significant promise for machinery fault detection (MFD). However, most existing RL-based MFD approaches do not fully exploit RL's sequential decision-making strengths, often treating MFD as a simple…

Machine Learning · Computer Science 2026-02-27 Dhiraj Neupane, Richard Dazeley, Mohamed Reda Bouadjenek, Sunil Aryal

Meta-Reinforcement Learning for Adaptive Control of Second Order Systems

Meta-learning is a branch of machine learning which aims to synthesize data from a distribution of related tasks to efficiently solve new ones. In process control, many systems have similar and well-understood dynamics, which suggests it is…

Machine Learning · Computer Science 2022-09-21 Daniel G. McClement, Nathan P. Lawrence, Michael G. Forbes, Philip D. Loewen, Johan U. Backström, R. Bhushan Gopaluni

Meta-Reinforcement Learning with Universal Policy Adaptation: Provable Near-Optimality under All-task Optimum Comparator

Meta-reinforcement learning (Meta-RL) has attracted attention due to its capability to enhance reinforcement learning (RL) algorithms, in terms of data efficiency and generalizability. In this paper, we develop a bilevel optimization…

Machine Learning · Computer Science 2024-10-15 Siyuan Xu, Minghui Zhu

Taylor Expansion of Discount Factors

In practical reinforcement learning (RL), the discount factor used for estimating value functions often differs from that used for defining the evaluation objective. In this work, we study the effect that this discrepancy of discount…

Machine Learning · Computer Science 2021-06-16 Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko

Offline Meta-level Model-based Reinforcement Learning Approach for Cold-Start Recommendation

Reinforcement learning (RL) has shown great promise in optimizing long-term user interest in recommender systems. However, existing RL-based recommendation methods need a large number of interactions for each user to learn a robust…

Machine Learning · Computer Science 2020-12-07 Yanan Wang, Yong Ge, Li Li, Rui Chen, Tong Xu

Decoupling Time and Risk: Risk-Sensitive Reinforcement Learning with General Discounting

Distributional reinforcement learning (RL) is a powerful framework increasingly adopted in safety-critical domains for its ability to optimize risk-sensitive objectives. However, the role of the discount factor is often overlooked, as it is…

Machine Learning · Computer Science 2026-02-05 Mehrdad Moghimi, Anthony Coache, Hyejin Ku

Meta-Reinforcement Learning for the Tuning of PI Controllers: An Offline Approach

Meta-learning is a branch of machine learning which trains neural network models to synthesize a wide variety of data in order to rapidly solve new problems. In process control, many systems have similar and well-understood dynamics, which…

Systems and Control · Electrical Eng. & Systems 2022-09-20 Daniel G. McClement, Nathan P. Lawrence, Johan U. Backstrom, Philip D. Loewen, Michael G. Forbes, R. Bhushan Gopaluni

Discovering Reinforcement Learning Algorithms

Reinforcement learning (RL) algorithms update an agent's parameters according to one of several possible rules, discovered manually through years of research. Automating the discovery of update rules from data could lead to more efficient…

Machine Learning · Computer Science 2021-01-06 Junhyuk Oh, Matteo Hessel, Wojciech M. Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver