Related papers: Training Agents using Upside-Down Reinforcement Le…

All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL

Upside down reinforcement learning (UDRL) flips the conventional use of the return in the objective function in RL upside down, by taking returns as input and predicting actions. UDRL is based purely on supervised learning, and bypasses…

Machine Learning · Computer Science 2022-02-25 Kai Arulkumaran, Dylan R. Ashley, Jürgen Schmidhuber, Rupesh K. Srivastava

Reinforcement Learning Upside Down: Don't Predict Rewards -- Just Map Them to Actions

We transform reinforcement learning (RL) into a form of supervised learning (SL) by turning traditional RL on its head, calling this Upside Down RL (UDRL). Standard RL predicts rewards, while UDRL instead uses rewards as task-defining…

Artificial Intelligence · Computer Science 2020-06-24 Juergen Schmidhuber

Upside-Down Reinforcement Learning for More Interpretable Optimal Control

Model-Free Reinforcement Learning (RL) algorithms either learn how to map states to expected rewards or search for policies that can maximize a certain performance function. Model-Based algorithms instead, aim to learn an approximation of…

Machine Learning · Computer Science 2024-11-19 Juan Cardenas-Cartagena, Massimiliano Falzari, Marco Zullich, Matthia Sabatelli

Upside-Down Reinforcement Learning Can Diverge in Stochastic Environments With Episodic Resets

Upside-Down Reinforcement Learning (UDRL) is an approach for solving RL problems that does not require value functions and uses only supervised learning, where the targets for given inputs in a dataset do not change over time. Ghosh et al.…

Machine Learning · Statistics 2022-05-16 Miroslav Štrupl, Francesco Faccio, Dylan R. Ashley, Jürgen Schmidhuber, Rupesh Kumar Srivastava

Upside Down Reinforcement Learning with Policy Generators

Upside Down Reinforcement Learning (UDRL) is a promising framework for solving reinforcement learning problems which focuses on learning command-conditioned policies. In this work, we extend UDRL to the task of learning a…

Machine Learning · Computer Science 2025-01-29 Jacopo Di Ventura, Dylan R. Ashley, Vincent Herrmann, Francesco Faccio, Jürgen Schmidhuber

Reward-Conditioned Reinforcement Learning

Single-task RL agents are typically trained under a fixed reward function, which limits their robustness to reward misspecification and their ability to adapt to changing preferences. We introduce Reward-Conditioned Reinforcement Learning…

Machine Learning · Computer Science 2026-05-20 Michal Nauman, Marek Cygan, Pieter Abbeel

Reinforcement Learning for UAV control with Policy and Reward Shaping

In recent years, unmanned aerial vehicle (UAV) related technology has expanded knowledge in the area, bringing to light new problems and challenges that require solutions. Furthermore, because the technology allows processes usually carried…

Artificial Intelligence · Computer Science 2022-12-08 Cristian Millán-Arias, Ruben Contreras, Francisco Cruz, Bruno Fernandes

Enhanced Penalty-based Bidirectional Reinforcement Learning Algorithms

This research focuses on enhancing reinforcement learning (RL) algorithms by integrating penalty functions to guide agents in avoiding unwanted actions while optimizing rewards. The goal is to improve the learning process by ensuring that…

Machine Learning · Computer Science 2025-04-07 Sai Gana Sandeep Pula, Sathish A. P. Kumar, Sumit Jha, Arvind Ramanathan

Robust Deep Reinforcement Learning Through Adversarial Attacks and Training : A Survey

Deep Reinforcement Learning (DRL) is a subfield of machine learning for training autonomous agents that take sequential actions across complex environments. Despite its significant performance in well-known environments, it remains…

Machine Learning · Computer Science 2024-12-12 Lucas Schott, Josephine Delas, Hatem Hajri, Elies Gherbi, Reda Yaich, Nora Boulahia-Cuppens, Frederic Cuppens, Sylvain Lamprier

Evolving Rewards to Automate Reinforcement Learning

Many continuous control tasks have easily formulated objectives, yet using them directly as a reward in reinforcement learning (RL) leads to suboptimal policies. Therefore, many classical control tasks guide RL training using complex…

Machine Learning · Computer Science 2019-05-21 Aleksandra Faust, Anthony Francis, Dar Mehta

Reinforcement Learning in Economics and Finance

Reinforcement learning algorithms describe how an agent can learn an optimal action policy in a sequential decision process, through repeated experience. In a given environment, the agent policy provides him some running and terminal…

Theoretical Economics · Economics 2020-03-24 Arthur Charpentier, Romuald Elie, Carl Remlinger

Information Directed Reward Learning for Reinforcement Learning

For many reinforcement learning (RL) applications, specifying a reward is difficult. This paper considers an RL setting where the agent obtains information about the reward only by querying an expert that can, for example, evaluate…

Machine Learning · Computer Science 2022-02-01 David Lindner, Matteo Turchetta, Sebastian Tschiatschek, Kamil Ciosek, Andreas Krause

Autonomous Reinforcement Learning via Subgoal Curricula

Reinforcement learning (RL) promises to enable autonomous acquisition of complex behaviors for diverse agents. However, the success of current reinforcement learning algorithms is predicated on an often under-emphasised requirement -- each…

Machine Learning · Computer Science 2021-10-29 Archit Sharma, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn

Backward Curriculum Reinforcement Learning

Current reinforcement learning algorithms train an agent using forward-generated trajectories, which provide little guidance so that the agent can explore as much as possible. While realizing the value of reinforcement learning results from…

Artificial Intelligence · Computer Science 2023-09-06 KyungMin Ko

A Differential Perspective on Distributional Reinforcement Learning

To date, distributional reinforcement learning (distributional RL) methods have exclusively focused on the discounted setting, where an agent aims to optimize a discounted sum of rewards over time. In this work, we extend distributional RL…

Machine Learning · Computer Science 2026-01-14 Juan Sebastian Rojas, Chi-Guhn Lee

Adversarial Agent Behavior Learning in Autonomous Driving Using Deep Reinforcement Learning

Existing approaches in reinforcement learning train an agent to learn desired optimal behavior in an environment with rule based surrounding agents. In safety critical applications such as autonomous driving it is crucial that the rule…

Computer Vision and Pattern Recognition · Computer Science 2025-08-22 Arjun Srinivasan, Anubhav Paras, Aniket Bera

A Survey on Reinforcement Learning Methods in Character Animation

Reinforcement Learning is an area of Machine Learning focused on how agents can be trained to make sequential decisions, and achieve a particular goal within an arbitrary environment. While learning, they repeatedly take actions based on…

Graphics · Computer Science 2022-05-26 Ariel Kwiatkowski, Eduardo Alvarado, Vicky Kalogeiton, C. Karen Liu, Julien Pettré, Michiel van de Panne, Marie-Paule Cani

Semi-supervised reward learning for offline reinforcement learning

In offline reinforcement learning (RL) agents are trained using a logged dataset. It appears to be the most natural route to attack real-life applications because in domains such as healthcare and robotics interactions with the environment…

Machine Learning · Computer Science 2020-12-15 Ksenia Konyushkova, Konrad Zolna, Yusuf Aytar, Alexander Novikov, Scott Reed, Serkan Cabi, Nando de Freitas

Directionality Reinforcement Learning to Operate Multi-Agent System without Communication

This paper establishes directionality reinforcement learning (DRL) technique to propose the complete decentralized multi-agent reinforcement learning method which can achieve cooperation based on each agent's learning: no communication and…

Multiagent Systems · Computer Science 2021-10-13 Fumito Uwano, Keiki Takadama

Autonomous Reinforcement Learning: Formalism and Benchmarking

Reinforcement learning (RL) provides a naturalistic framing for learning through trial and error, which is appealing both because of its simplicity and effectiveness and because of its resemblance to how humans and animals acquire skills…

Machine Learning · Computer Science 2022-08-09 Archit Sharma, Kelvin Xu, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn