Related papers: Sparse Reward Processes

Dealing with Sparse Rewards in Reinforcement Learning

Successfully navigating a complex environment to obtain a desired outcome is a difficult task, that up to recently was believed to be capable only by humans. This perception has been broken down over time, especially with the introduction…

Machine Learning · Computer Science 2019-11-12 Joshua Hare

Learning to Bid Long-Term: Multi-Agent Reinforcement Learning with Long-Term and Sparse Reward in Repeated Auction Games

We propose a multi-agent distributed reinforcement learning algorithm that balances between potentially conflicting short-term reward and sparse, delayed long-term reward, and learns with partial information in a dynamic environment. We…

Machine Learning · Computer Science 2022-04-06 Jing Tan, Ramin Khalili, Holger Karl

Online Learning with Costly Features in Non-stationary Environments

Maximizing long-term rewards is the primary goal in sequential decision-making problems. The majority of existing methods assume that side information is freely available, enabling the learning agent to observe all features' states before…

Machine Learning · Computer Science 2023-07-19 Saeed Ghoorchian, Evgenii Kortukov, Setareh Maghsudi

Reinforced Imitation in Heterogeneous Action Space

Imitation learning is an effective alternative approach to learn a policy when the reward function is sparse. In this paper, we consider a challenging setting where an agent and an expert use different actions from each other. We assume…

Machine Learning · Computer Science 2019-08-27 Konrad Zolna, Negar Rostamzadeh, Yoshua Bengio, Sungjin Ahn, Pedro O. Pinheiro

Open-Ended Learning Leads to Generally Capable Agents

In this work we create agents that can perform well beyond a single, individual task, that exhibit much wider generalisation of behaviour to a massive, rich space of challenges. We define a universe of tasks within an environment domain and…

Machine Learning · Computer Science 2021-08-03 Open Ended Learning Team, Adam Stooke, Anuj Mahajan, Catarina Barros, Charlie Deck, Jakob Bauer, Jakub Sygnowski, Maja Trebacz, Max Jaderberg, Michael Mathieu, Nat McAleese, Nathalie Bradley-Schmieg, Nathaniel Wong, Nicolas Porcel, Roberta Raileanu, Steph Hughes-Fitt, Valentin Dalibard, Wojciech Marian Czarnecki

Online reinforcement learning with sparse rewards through an active inference capsule

Intelligent agents must pursue their goals in complex environments with partial information and often limited computational capacity. Reinforcement learning methods have achieved great success by creating agents that optimize engineered…

Machine Learning · Computer Science 2021-06-07 Alejandro Daniel Noel, Charel van Hoof, Beren Millidge

Sequential Strategic Classification with Multi-Stage Selective Classifiers

Strategic classification studies the problem where self-interested individuals or agents manipulate their response to obtain favorable decision outcomes made by classifiers, typically turning to dishonest actions when they are less costly…

Machine Learning · Computer Science 2026-05-07 Ziyuan Huang, Lina Alkarmi, Mingyan Liu

Asymptotic Learnability of Reinforcement Problems with Arbitrary Dependence

We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions. The task for an agent is to attain the best possible asymptotic reward where the…

Machine Learning · Computer Science 2007-05-23 Daniil Ryabko, Marcus Hutter

On the Possibility of Learning in Reactive Environments with Arbitrary Dependence

We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions, i.e. environments more general than (PO)MDPs. The task for an agent is to attain…

Machine Learning · Computer Science 2009-12-30 Daniil Ryabko, Marcus Hutter

Hindsight policy gradients

A reinforcement learning agent that needs to pursue different goals across episodes requires a goal-conditional policy. In addition to their potential to generalize desirable behavior to unseen goals, such policies may also enable…

Machine Learning · Computer Science 2019-02-21 Paulo Rauber, Avinash Ummadisingu, Filipe Mutz, Juergen Schmidhuber

Curiosity-Driven Exploration via Latent Bayesian Surprise

The human intrinsic desire to pursue knowledge, also known as curiosity, is considered essential in the process of skill acquisition. With the aid of artificial curiosity, we could equip current techniques for control, such as Reinforcement…

Machine Learning · Computer Science 2022-02-24 Pietro Mazzaglia, Ozan Catal, Tim Verbelen, Bart Dhoedt

Learning to Design Games: Strategic Environments in Reinforcement Learning

In typical reinforcement learning (RL), the environment is assumed given and the goal of the learning is to identify an optimal policy for the agent taking actions through its interactions with the environment. In this paper, we extend this…

Artificial Intelligence · Computer Science 2019-10-25 Haifeng Zhang, Jun Wang, Zhiming Zhou, Weinan Zhang, Ying Wen, Yong Yu, Wenxin Li

Towards Improving Exploration in Self-Imitation Learning using Intrinsic Motivation

Reinforcement Learning has emerged as a strong alternative to solve optimization tasks efficiently. The use of these algorithms highly depends on the feedback signals provided by the environment in charge of informing about how good (or…

Machine Learning · Computer Science 2022-12-01 Alain Andres, Esther Villar-Rodriguez, Javier Del Ser

A Short Survey on Probabilistic Reinforcement Learning

A reinforcement learning agent tries to maximize its cumulative payoff by interacting in an unknown environment. It is important for the agent to explore suboptimal actions as well as to pick actions with highest known rewards. Yet, in…

Machine Learning · Computer Science 2019-01-23 Reazul Hasan Russel

Towards better dense rewards in Reinforcement Learning Applications

Finding meaningful and accurate dense rewards is a fundamental task in the field of reinforcement learning (RL) that enables agents to explore environments more efficiently. In traditional RL settings, agents learn optimal policies through…

Artificial Intelligence · Computer Science 2025-12-05 Shuyuan Zhang

Intrinsic Motivation for Encouraging Synergistic Behavior

We study the role of intrinsic motivation as an exploration bias for reinforcement learning in sparse-reward synergistic tasks, which are tasks where multiple agents must work together to achieve a goal they could not individually. Our key…

Machine Learning · Computer Science 2020-02-14 Rohan Chitnis, Shubham Tulsiani, Saurabh Gupta, Abhinav Gupta

Collaborative Training of Heterogeneous Reinforcement Learning Agents in Environments with Sparse Rewards: What and When to Share?

In the early stages of human life, babies develop their skills by exploring different scenarios motivated by their inherent satisfaction rather than by extrinsic rewards from the environment. This behavior, referred to as intrinsic…

Machine Learning · Computer Science 2022-02-25 Alain Andres, Esther Villar-Rodriguez, Javier Del Ser

The problem with DDPG: understanding failures in deterministic environments with sparse rewards

In environments with continuous state and action spaces, state-of-the-art actor-critic reinforcement learning algorithms can solve very complex problems, yet can also fail in environments that seem trivial, but the reason for such failures…

Machine Learning · Computer Science 2022-06-10 Guillaume Matheron, Nicolas Perrin, Olivier Sigaud

Probabilistic inverse reinforcement learning in unknown environments

We consider the problem of learning by demonstration from agents acting in unknown stochastic Markov environments or games. Our aim is to estimate agent preferences in order to construct improved policies for the same task that the agents…

Machine Learning · Computer Science 2014-08-12 Aristide Tossou, Christos Dimitrakakis

Probabilistic inverse reinforcement learning in unknown environments

We consider the problem of learning by demonstration from agents acting in unknown stochastic Markov environments or games. Our aim is to estimate agent preferences in order to construct improved policies for the same task that the agents…

Machine Learning · Statistics 2013-07-16 Aristide C. Y. Tossou, Christos Dimitrakakis