Related papers: Bounding the Optimal Value Function in Composition…

Utilizing Prior Solutions for Reward Shaping and Composition in Entropy-Regularized Reinforcement Learning

In reinforcement learning (RL), the ability to utilize prior knowledge from previously solved tasks can allow agents to quickly solve new problems. In some cases, these new problems may be approximately solved by composing the solutions of…

Machine Learning · Computer Science 2023-01-02 Jacob Adamczyk, Argenis Arriojas, Stas Tiomkin, Rahul V. Kulkarni

Will it Blend? Composing Value Functions in Reinforcement Learning

An important property for lifelong-learning agents is the ability to combine existing skills to solve unseen tasks. In general, however, it is unclear how to compose skills in a principled way. We provide a "recipe" for optimal value…

Machine Learning · Computer Science 2018-07-13 Benjamin van Niekerk, Steven James, Adam Earle, Benjamin Rosman

Constrained Reinforcement Learning Has Zero Duality Gap

Autonomous agents must often deal with conflicting requirements, such as completing tasks using the least amount of time/energy, learning multiple tasks, or dealing with multiple opponents. In the context of reinforcement learning (RL),…

Machine Learning · Computer Science 2019-10-30 Santiago Paternain, Luiz F. O. Chamon, Miguel Calvo-Fullana, Alejandro Ribeiro

Leveraging Prior Knowledge in Reinforcement Learning via Double-Sided Bounds on the Value Function

An agent's ability to leverage past experience is critical for efficiently solving new tasks. Approximate solutions for new tasks can be obtained from previously derived value functions, as demonstrated by research on transfer learning,…

Machine Learning · Computer Science 2023-09-06 Jacob Adamczyk, Stas Tiomkin, Rahul Kulkarni

Safety-Aware Task Composition for Discrete and Continuous Reinforcement Learning

Compositionality is a critical aspect of scalable system design. Reinforcement learning (RL) has recently shown substantial success in task learning, but has only recently begun to truly leverage composition. In this paper, we focus on…

Machine Learning · Computer Science 2023-06-30 Kevin Leahy, Makai Mann, Zachary Serlin

Robust Subtask Learning for Compositional Generalization

Compositional reinforcement learning is a promising approach for training policies to perform complex long-horizon tasks. Typically, a high-level task is decomposed into a sequence of subtasks and a separate policy is trained to perform…

Machine Learning · Computer Science 2023-06-09 Kishor Jothimurugan, Steve Hsu, Osbert Bastani, Rajeev Alur

Composing Entropic Policies using Divergence Correction

Composing previously mastered skills to solve novel tasks promises dramatic improvements in the data efficiency of reinforcement learning. Here, we analyze two recent works composing behaviors represented in the form of action-value…

Machine Learning · Computer Science 2019-07-08 Jonathan J Hunt, Andre Barreto, Timothy P Lillicrap, Nicolas Heess

A Review of Reward Functions for Reinforcement Learning in the context of Autonomous Driving

Reinforcement learning has emerged as an important approach for autonomous driving. A reward function is used in reinforcement learning to establish the learned skill objectives and guide the agent toward the optimal policy. Since…

Robotics · Computer Science 2026-03-05 Ahmed Abouelazm, Jonas Michel, J. Marius Zoellner

A Unified Framework for Zero-Shot Reinforcement Learning

Zero-shot reinforcement learning (RL) has emerged as a setting for developing general agents, capable of solving downstream tasks without additional training or planning at test-time. While conventional RL optimizes policies for fixed…

Machine Learning · Computer Science 2026-03-10 Jacopo Di Ventura, Jan Felix Kleuker, Aske Plaat, Thomas Moerland

Reinforcement Learning for Compositional Generalization with Outcome-Level Optimization

Compositional generalization refers to correctly interpreting novel combinations of known primitives, which remains a major challenge. Existing approaches often rely on supervised fine-tuning, which encourages models to imitate target outputs.…

Machine Learning · Computer Science 2026-05-07 Xiyan Fu, Wei Liu

Towards Task-Prioritized Policy Composition

Combining learned policies in a prioritized, ordered manner is desirable because it allows for modular design and facilitates data reuse through knowledge transfer. In control theory, prioritized composition is realized by null-space…

Machine Learning · Computer Science 2022-09-21 Finn Rietz, Erik Schaffernicht, Todor Stoyanov, Johannes A. Stork

To the Max: Reinventing Reward in Reinforcement Learning

In reinforcement learning (RL), different reward functions can define the same optimal policy but result in drastically different learning performance. For some, the agent gets stuck with a suboptimal behavior, and for others, it solves the…

Machine Learning · Computer Science 2025-02-25 Grigorii Veviurko, Wendelin Böhmer, Mathijs de Weerdt

On Value Functions and the Agent-Environment Boundary

When function approximation is deployed in reinforcement learning (RL), the same problem may be formulated in different ways, often by treating a pre-processing step as a part of the environment or as part of the agent. As a consequence,…

Machine Learning · Computer Science 2020-06-02 Nan Jiang

Modular Lifelong Reinforcement Learning via Neural Composition

Humans commonly solve complex problems by decomposing them into easier subproblems and then combining the subproblem solutions. This type of compositional reasoning permits reuse of the subproblem solutions when tackling future tasks that…

Machine Learning · Computer Science 2022-07-04 Jorge A. Mendez, Harm van Seijen, Eric Eaton

A Boolean Task Algebra for Reinforcement Learning

The ability to compose learned skills to solve new tasks is an important property of lifelong-learning agents. In this work, we formalise the logical composition of tasks as a Boolean algebra. This allows us to formulate new tasks in terms…

Machine Learning · Computer Science 2020-10-16 Geraud Nangue Tasse, Steven James, Benjamin Rosman

Zero-Shot Policy Transfer with Disentangled Task Representation of Meta-Reinforcement Learning

Humans are capable of abstracting various tasks as different combinations of multiple attributes. This perspective of compositionality is vital for human rapid learning and adaption since previous experiences from related tasks can be…

Robotics · Computer Science 2022-10-04 Zheng Wu, Yichen Xie, Wenzhao Lian, Changhao Wang, Yanjiang Guo, Jianyu Chen, Stefan Schaal, Masayoshi Tomizuka

On Zero-Shot Reinforcement Learning

Modern reinforcement learning (RL) systems capture deep truths about general, human problem-solving. In domains where new data can be simulated cheaply, these systems uncover sequential decision-making policies that far exceed the ability…

Machine Learning · Computer Science 2025-10-07 Scott Jeen

Compositional Reinforcement Learning for Discrete-Time Stochastic Control Systems

We propose a compositional approach to synthesize policies for networks of continuous-space stochastic control systems with unknown dynamics using model-free reinforcement learning (RL). The approach is based on implicitly abstracting each…

Systems and Control · Electrical Eng. & Systems 2022-08-09 Abolfazl Lavaei, Mateo Perez, Milad Kazemi, Fabio Somenzi, Sadegh Soudjani, Ashutosh Trivedi, Majid Zamani

Environment Generation for Zero-Shot Compositional Reinforcement Learning

Many real-world problems are compositional: solving them requires completing interdependent sub-tasks, either in series or in parallel, that can be represented as a dependency graph. Deep reinforcement learning (RL) agents often struggle…

Machine Learning · Computer Science 2022-01-25 Izzeddin Gur, Natasha Jaques, Yingjie Miao, Jongwook Choi, Manoj Tiwari, Honglak Lee, Aleksandra Faust

Verifiable Reinforcement Learning Systems via Compositionality

We propose a framework for verifiable and compositional reinforcement learning (RL) in which a collection of RL subsystems, each of which learns to accomplish a separate subtask, are composed to achieve an overall task. The framework…

Systems and Control · Electrical Eng. & Systems 2023-09-13 Cyrus Neary, Aryaman Singh Samyal, Christos Verginis, Murat Cubuktepe, Ufuk Topcu