Related papers: Does Zero-Shot Reinforcement Learning Exist?

Zero-Shot Reinforcement Learning from Low Quality Data

Zero-shot reinforcement learning (RL) promises to provide agents that can perform any task in an environment after an offline, reward-free pre-training phase. Methods leveraging successor measures and successor features have shown strong…

Machine Learning · Computer Science 2024-10-31 Scott Jeen, Tom Bewley, Jonathan M. Cullen

A Unified Framework for Zero-Shot Reinforcement Learning

Zero-shot reinforcement learning (RL) has emerged as a setting for developing general agents, capable of solving downstream tasks without additional training or planning at test-time. While conventional RL optimizes policies for fixed…

Machine Learning · Computer Science 2026-03-10 Jacopo Di Ventura, Jan Felix Kleuker, Aske Plaat, Thomas Moerland

On Zero-Shot Reinforcement Learning

Modern reinforcement learning (RL) systems capture deep truths about general, human problem-solving. In domains where new data can be simulated cheaply, these systems uncover sequential decision-making policies that far exceed the ability…

Machine Learning · Computer Science 2025-10-07 Scott Jeen

Tackling the Zero-Shot Reinforcement Learning Loss Directly

Zero-shot reinforcement learning (RL) methods aim at instantly producing a behavior for an RL task in a given environment, from a description of the reward function. These methods are usually tested by evaluating their average performance…

Machine Learning · Computer Science 2025-02-18 Yann Ollivier

Improving Zero-Shot Offline RL via Behavioral Task Sampling

Offline zero-shot reinforcement learning (RL) aims to learn agents that optimize unseen reward functions without additional environment interaction. The standard approach to this problem trains task-conditioned policies by sampling task…

Artificial Intelligence · Computer Science 2026-04-29 Nazim Bendib, Nicolas Perrin-Gilbert, Olivier Sigaud

Soft Forward-Backward Representations for Zero-shot Reinforcement Learning with General Utilities

Recent advancements in zero-shot reinforcement learning (RL) have facilitated the extraction of diverse behaviors from unlabeled, offline data sources. In particular, forward-backward algorithms (FB) can retrieve a family of policies that…

Machine Learning · Computer Science 2026-02-09 Marco Bagatella, Thomas Rupf, Georg Martius, Andreas Krause

Which Features are Best for Successor Features?

In reinforcement learning, universal successor features (SFs) are a way to provide zero-shot adaptation to new tasks at test time: they provide optimal policies for all downstream reward functions lying in the linear span of a set of base…

Machine Learning · Computer Science 2025-02-18 Yann Ollivier

Proto Successor Measure: Representing the Behavior Space of an RL Agent

Having explored an environment, intelligent agents should be able to transfer their knowledge to most downstream tasks within that environment without additional interactions. Referred to as "zero-shot learning", this ability remains…

Machine Learning · Computer Science 2025-03-12 Siddhant Agarwal, Harshit Sikchi, Peter Stone, Amy Zhang

Optimistic Linear Support and Successor Features as a Basis for Optimal Policy Transfer

In many real-world applications, reinforcement learning (RL) agents might have to solve multiple tasks, each one typically modeled via a reward function. If reward functions are expressed linearly, and the agent has previously learned a set…

Machine Learning · Computer Science 2022-06-24 Lucas N. Alegre, Ana L. C. Bazzan, Bruno C. da Silva

Zero-Shot Reinforcement Learning via Function Encoders

Although reinforcement learning (RL) can solve many challenging sequential decision making problems, achieving zero-shot transfer across related tasks remains a challenge. The difficulty lies in finding a good representation for the current…

Machine Learning · Computer Science 2025-03-24 Tyler Ingebrand, Amy Zhang, Ufuk Topcu

Fast Adaptation with Behavioral Foundation Models

Unsupervised zero-shot reinforcement learning (RL) has emerged as a powerful paradigm for pretraining behavioral foundation models (BFMs), enabling agents to solve a wide range of downstream tasks specified via reward functions in a…

Machine Learning · Computer Science 2025-04-11 Harshit Sikchi, Andrea Tirinzoni, Ahmed Touati, Yingchen Xu, Anssi Kanervisto, Scott Niekum, Amy Zhang, Alessandro Lazaric, Matteo Pirotta

Zero-Shot Reinforcement Learning Under Partial Observability

Recent work has shown that, under certain assumptions, zero-shot reinforcement learning (RL) methods can generalise to any unseen task in an environment after reward-free pre-training. Access to Markov states is one such assumption, yet, in…

Machine Learning · Computer Science 2025-06-19 Scott Jeen, Tom Bewley, Jonathan M. Cullen

PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning

We study reinforcement learning (RL) with no-reward demonstrations, a setting in which an RL agent has access to additional data from the interaction of other agents with the same environment. However, it has no access to the rewards or…

Machine Learning · Computer Science 2021-06-11 Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar

On Reward-Free Reinforcement Learning with Linear Function Approximation

Reward-free reinforcement learning (RL) is a framework which is suitable for both the batch RL setting and the setting where there are many reward functions of interest. During the exploration phase, an agent collects samples without using…

Machine Learning · Computer Science 2020-06-22 Ruosong Wang, Simon S. Du, Lin F. Yang, Ruslan Salakhutdinov

Can We Really Learn One Representation to Optimize All Rewards?

As machine learning has moved towards leveraging large models as priors for downstream tasks, the community has debated the right form of prior for solving reinforcement learning (RL) problems. If one were to try to prefetch as much…

Machine Learning · Computer Science 2026-02-13 Chongyi Zheng, Royina Karegoudra Jayanth, Benjamin Eysenbach

Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models

Unsupervised reinforcement learning (RL) aims at pre-training agents that can solve a wide range of downstream tasks in complex environments. Despite recent advancements, existing approaches suffer from several limitations: they may require…

Machine Learning · Computer Science 2025-04-16 Andrea Tirinzoni, Ahmed Touati, Jesse Farebrother, Mateusz Guzek, Anssi Kanervisto, Yingchen Xu, Alessandro Lazaric, Matteo Pirotta

RLZero: Direct Policy Inference from Language Without In-Domain Supervision

The reward hypothesis states that all goals and purposes can be understood as the maximization of a received scalar reward signal. However, in practice, defining such a reward signal is notoriously difficult, as humans are often unable to…

Artificial Intelligence · Computer Science 2025-11-26 Harshit Sikchi, Siddhant Agarwal, Pranaya Jajoo, Samyak Parajuli, Caleb Chuck, Max Rudolph, Peter Stone, Amy Zhang, Scott Niekum

Operator Deep Q-Learning: Zero-Shot Reward Transferring in Reinforcement Learning

Reinforcement learning (RL) has drawn increasing interests in recent years due to its tremendous success in various applications. However, standard RL algorithms can only be applied for single reward function, and cannot adapt to an unseen…

Machine Learning · Computer Science 2022-01-04 Ziyang Tang, Yihao Feng, Qiang Liu

Zero-Shot Learning by Generating Pseudo Feature Representations

Zero-shot learning (ZSL) is a challenging task aiming at recognizing novel classes without any training instances. In this paper we present a simple but high-performance ZSL approach by generating pseudo feature representations (GPFR).…

Computer Vision and Pattern Recognition · Computer Science 2018-09-11 Jiang Lu, Jin Li, Ziang Yan, Changshui Zhang

Learning Robust and Adaptive Real-World Continuous Control Using Simulation and Transfer Learning

We use model-free reinforcement learning, extensive simulation, and transfer learning to develop a continuous control algorithm that has good zero-shot performance in a real physical environment. We train a simulated agent to act optimally…

Artificial Intelligence · Computer Science 2018-03-09 M Ferguson, K. H. Law