Related papers: Augmenting Unsupervised Reinforcement Learning wit…

Data-Efficient Reinforcement Learning with Self-Predictive Representations

While deep reinforcement learning excels at solving tasks where large amounts of data can be collected through virtually unlimited interaction with the environment, learning from limited interaction remains a key challenge. We posit that an…

Machine Learning · Computer Science 2021-05-21 Max Schwarzer, Ankesh Anand, Rishab Goel, R Devon Hjelm, Aaron Courville, Philip Bachman

Breaking the Performance Ceiling in Reinforcement Learning requires Inference Strategies

Reinforcement learning (RL) systems have countless applications, from energy-grid management to protein design. However, such real-world scenarios are often extremely difficult, combinatorial in nature, and require complex coordination…

Machine Learning · Computer Science 2025-12-19 Felix Chalumeau, Daniel Rajaonarivonivelomanantsoa, Ruan de Kock, Claude Formanek, Sasha Abramowitz, Oumayma Mahjoub, Wiem Khlifi, Simon Du Toit, Louay Ben Nessir, Refiloe Shabe, Noah De Nicola, Arnol Fokam, Siddarth Singh, Ulrich Mbou Sob, Arnu Pretorius

Retrieval-Augmented Reinforcement Learning

Most deep reinforcement learning (RL) algorithms distill experience into parametric behavior policies or value functions via gradient updates. While effective, this approach has several disadvantages: (1) it is computationally expensive,…

Machine Learning · Computer Science 2022-05-25 Anirudh Goyal, Abram L. Friesen, Andrea Banino, Theophane Weber, Nan Rosemary Ke, Adria Puigdomenech Badia, Arthur Guez, Mehdi Mirza, Peter C. Humphreys, Ksenia Konyushkova, Laurent Sifre, Michal Valko, Simon Osindero, Timothy Lillicrap, Nicolas Heess, Charles Blundell

Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting

Deep pretrained language models have achieved great success in the way of pretraining first and then fine-tuning. But such a sequential transfer learning paradigm often confronts the catastrophic forgetting problem and leads to sub-optimal…

Computation and Language · Computer Science 2020-04-28 Sanyuan Chen, Yutai Hou, Yiming Cui, Wanxiang Che, Ting Liu, Xiangzhan Yu

On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning

Intelligent agents should have the ability to leverage knowledge from previously learned tasks in order to learn new ones quickly and efficiently. Meta-learning approaches have emerged as a popular solution to achieve this. However,…

Machine Learning · Computer Science 2023-02-17 Zhao Mandi, Pieter Abbeel, Stephen James

A Short Survey On Memory Based Reinforcement Learning

Reinforcement learning (RL) is a branch of machine learning which is employed to solve various sequential decision making problems without proper supervision. Due to the recent advancement of deep learning, the newly proposed Deep-RL…

Artificial Intelligence · Computer Science 2019-04-17 Dhruv Ramani

SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning

Preference-based reinforcement learning (RL) has shown potential for teaching agents to perform the target tasks without a costly, pre-defined reward function by learning the reward with a supervisor's preference between the two agent…

Machine Learning · Computer Science 2022-03-21 Jongjin Park, Younggyo Seo, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee

Reinforcement Learning with Unsupervised Auxiliary Tasks

Deep reinforcement learning agents have achieved state-of-the-art results by directly maximising cumulative reward. However, environments contain a much wider variety of possible training signals. In this paper, we introduce an agent that…

Machine Learning · Computer Science 2016-11-17 Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z Leibo, David Silver, Koray Kavukcuoglu

RetroAgent: From Solving to Evolving via Retrospective Dual Intrinsic Feedback

Standard reinforcement learning (RL) for large language model (LLM) agents typically optimizes extrinsic rewards, prioritizing isolated task completion over continual adaptation. Consequently, agents often converge to suboptimal policies…

Artificial Intelligence · Computer Science 2026-03-31 Xiaoying Zhang, Zichen Liu, Yipeng Zhang, Xia Hu, Wenqi Shao

When in Doubt, Think Slow: Iterative Reasoning with Latent Imagination

In an unfamiliar setting, a model-based reinforcement learning agent can be limited by the accuracy of its world model. In this work, we present a novel, training-free approach to improving the performance of such agents separately from…

Machine Learning · Computer Science 2024-02-26 Martin Benfeghoul, Umais Zahid, Qinghai Guo, Zafeirios Fountas

MetaReflection: Learning Instructions for Language Agents using Past Reflections

The popularity of Large Language Models (LLMs) has unleashed a new age of Language Agents for solving a diverse range of tasks. While contemporary frontier LLMs are capable enough to power reasonably good Language agents, the closed-API…

Computation and Language · Computer Science 2024-10-11 Priyanshu Gupta, Shashank Kirtania, Ananya Singha, Sumit Gulwani, Arjun Radhakrishna, Sherry Shi, Gustavo Soares

Skill-Based Reinforcement Learning with Intrinsic Reward Matching

While unsupervised skill discovery has shown promise in autonomously acquiring behavioral primitives, there is still a large methodological disconnect between task-agnostic skill pretraining and downstream, task-aware finetuning. We present…

Machine Learning · Computer Science 2023-05-29 Ademi Adeniji, Amber Xie, Pieter Abbeel

Prioritized Experience-based Reinforcement Learning with Human Guidance for Autonomous Driving

Reinforcement learning (RL) requires skillful definition and remarkable computational efforts to solve optimization and control problems, which could impair its prospect. Introducing human guidance into reinforcement learning is a promising…

Machine Learning · Computer Science 2022-11-30 Jingda Wu, Zhiyu Huang, Wenhui Huang, Chen Lv

Unsupervised Meta-Learning for Reinforcement Learning

Meta-learning algorithms use past experience to learn to quickly solve new tasks. In the context of reinforcement learning, meta-learning algorithms acquire reinforcement learning procedures to solve new problems more efficiently by…

Machine Learning · Computer Science 2020-05-01 Abhishek Gupta, Benjamin Eysenbach, Chelsea Finn, Sergey Levine

Introspective Experience Replay: Look Back When Surprised

In reinforcement learning (RL), experience replay-based sampling techniques play a crucial role in promoting convergence by eliminating spurious correlations. However, widely used methods such as uniform experience replay (UER) and…

Machine Learning · Computer Science 2023-02-07 Ramnath Kumar, Dheeraj Nagaraj

Self Punishment and Reward Backfill for Deep Q-Learning

Reinforcement learning agents learn by encouraging behaviours which maximize their total reward, usually provided by the environment. In many environments, however, the reward is provided after a series of actions rather than each single…

Artificial Intelligence · Computer Science 2022-01-04 Mohammad Reza Bonyadi, Rui Wang, Maryam Ziaei

Self-Supervised Relational Reasoning for Representation Learning

In self-supervised learning, a system is tasked with achieving a surrogate objective by defining alternative targets on a set of unlabeled data. The aim is to build useful representations that can be used in downstream tasks, without costly…

Machine Learning · Computer Science 2020-11-11 Massimiliano Patacchiola, Amos Storkey

Reinforcement Inference: Leveraging Uncertainty for Self-Correcting Language Model Reasoning

Modern large language models (LLMs) are often evaluated and deployed under a one-shot, greedy inference protocol, especially in professional settings that require deterministic behavior. This regime can systematically under-estimate a fixed…

Artificial Intelligence · Computer Science 2026-02-13 Xinhai Sun

Augmented Replay Memory in Reinforcement Learning With Continuous Control

Online reinforcement learning agents are currently able to process an increasing amount of data by converting it into higher-order value functions. This expansion of the information collected from the environment increases the agent's…

Machine Learning · Computer Science 2021-02-04 Mirza Ramicic, Andrea Bonarini

Can We Really Learn One Representation to Optimize All Rewards?

As machine learning has moved towards leveraging large models as priors for downstream tasks, the community has debated the right form of prior for solving reinforcement learning (RL) problems. If one were to try to prefetch as much…

Machine Learning · Computer Science 2026-02-13 Chongyi Zheng, Royina Karegoudra Jayanth, Benjamin Eysenbach