Related papers: Exploration Potential

Exploration Conscious Reinforcement Learning Revisited

The Exploration-Exploitation tradeoff arises in Reinforcement Learning when one cannot tell if a policy is optimal. Then, there is a constant need to explore new actions instead of exploiting past experience. In practice, it is common to…

Machine Learning · Computer Science 2019-09-10 Lior Shani , Yonathan Efroni , Shie Mannor

Exploration Unbound

A sequential decision-making agent balances between exploring to gain new knowledge about an environment and exploiting current knowledge to maximize immediate reward. For environments studied in the traditional literature, optimal…

Machine Learning · Computer Science 2024-07-23 Dilip Arumugam , Wanqiao Xu , Benjamin Van Roy

A Short Survey on Probabilistic Reinforcement Learning

A reinforcement learning agent tries to maximize its cumulative payoff by interacting in an unknown environment. It is important for the agent to explore suboptimal actions as well as to pick actions with highest known rewards. Yet, in…

Machine Learning · Computer Science 2019-01-23 Reazul Hasan Russel

Model-Based Bayesian Exploration

Reinforcement learning systems are often concerned with balancing exploration of untested actions against exploitation of actions that are known to be good. The benefit of exploration can be estimated using the classical notion of Value of…

Artificial Intelligence · Computer Science 2013-01-30 Richard Dearden , Nir Friedman , David Andre

Exploitation Is All You Need... for Exploration

Ensuring sufficient exploration is a central challenge when training meta-reinforcement learning (meta-RL) agents to solve novel environments. Conventional solutions to the exploration-exploitation dilemma inject explicit incentives such as…

Machine Learning · Computer Science 2025-08-05 Micah Rentschler , Jesse Roberts

Exploration and Incentives in Reinforcement Learning

How do you incentivize self-interested agents to $\textit{explore}$ when they prefer to $\textit{exploit}$? We consider complex exploration problems, where each agent faces the same (but unknown) MDP. In contrast with traditional…

Machine Learning · Computer Science 2023-02-21 Max Simchowitz , Aleksandrs Slivkins

Meta-Learning to Explore via Memory Density Feedback

Exploration algorithms for reinforcement learning typically replace or augment the reward function with an additional ``intrinsic'' reward that trains the agent to seek previously unseen states of the environment. Here, we consider an…

Machine Learning · Computer Science 2025-09-30 Kevin McKee , Eric Alt , Andrew Grebenisan , Mick van Gelderen , Gary Miguel

Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits

We introduce the "inverse bandit" problem of estimating the rewards of a multi-armed bandit instance from observing the learning process of a low-regret demonstrator. Existing approaches to the related problem of inverse reinforcement…

Machine Learning · Statistics 2022-02-23 Wenshuo Guo , Kumar Krishna Agrawal , Aditya Grover , Vidya Muthukumar , Ashwin Pananjady

Modeling Human Exploration Through Resource-Rational Reinforcement Learning

Equipping artificial agents with useful exploration mechanisms remains a challenge to this day. Humans, on the other hand, seem to manage the trade-off between exploration and exploitation effortlessly. In the present article, we put…

Machine Learning · Computer Science 2022-11-15 Marcel Binz , Eric Schulz

Nearly optimal exploration-exploitation decision thresholds

While in general trading off exploration and exploitation in reinforcement learning is hard, under some formulations relatively simple solutions exist. In this paper, we first derive upper bounds for the utility of selecting different…

Artificial Intelligence · Computer Science 2018-06-06 Christos Dimitrakakis

Beyond Optimism: Exploration With Partially Observable Rewards

Exploration in reinforcement learning (RL) remains an open challenge. RL algorithms rely on observing rewards to train the agent, and if informative rewards are sparse the agent learns slowly or may not learn at all. To improve exploration…

Machine Learning · Computer Science 2024-11-12 Simone Parisi , Alireza Kazemipour , Michael Bowling

Diversity-Driven Selection of Exploration Strategies in Multi-Armed Bandits

We consider a scenario where an agent has multiple available strategies to explore an unknown environment. For each new interaction with the environment, the agent must select which exploration strategy to use. We provide a new…

Machine Learning · Computer Science 2018-08-24 Fabien C. Y. Benureau , Pierre-Yves Oudeyer

Learning to Explore by Reinforcement over High-Level Options

Autonomous 3D environment exploration is a fundamental task for various applications such as navigation. The goal of exploration is to investigate a new environment and build its occupancy map efficiently. In this paper, we propose a new…

Artificial Intelligence · Computer Science 2021-11-03 Liu Juncheng , McCane Brendan , Mills Steven

A Survey of Exploration Methods in Reinforcement Learning

Exploration is an essential component of reinforcement learning algorithms, where agents need to learn how to predict and control unknown and often stochastic environments. Reinforcement learning agents depend crucially on exploration to…

Machine Learning · Computer Science 2021-09-03 Susan Amin , Maziar Gomrokchi , Harsh Satija , Herke van Hoof , Doina Precup

Constrained Exploration and Recovery from Experience Shaping

We consider the problem of reinforcement learning under safety requirements, in which an agent is trained to complete a given task, typically formalized as the maximization of a reward signal over time, while concurrently avoiding…

Machine Learning · Computer Science 2018-09-25 Tu-Hoa Pham , Giovanni De Magistris , Don Joven Agravante , Subhajit Chaudhury , Asim Munawar , Ryuki Tachibana

Hyper-parameter Tuning for the Contextual Bandit

We study here the problem of learning the exploration exploitation trade-off in the contextual bandit problem with linear reward function setting. In the traditional algorithms that solve the contextual bandit problem, the exploration is a…

Machine Learning · Computer Science 2020-05-06 Djallel Bouneffouf , Emmanuelle Claeys

Understanding the Origin of Information-Seeking Exploration in Probabilistic Objectives for Control

The exploration-exploitation trade-off is central to the description of adaptive behaviour in fields ranging from machine learning, to biology, to economics. While many approaches have been taken, one approach to solving this trade-off has…

Machine Learning · Computer Science 2021-11-29 Beren Millidge , Anil Seth , Christopher Buckley

The Role of Exploration for Task Transfer in Reinforcement Learning

The exploration--exploitation trade-off in reinforcement learning (RL) is a well-known and much-studied problem that balances greedy action selection with novel experience, and the study of exploration methods is usually only considered in…

Machine Learning · Computer Science 2022-10-13 Jonathan C Balloch , Julia Kim , and Jessica L Inman , Mark O Riedl

Curiosity Killed or Incapacitated the Cat and the Asymptotically Optimal Agent

Reinforcement learners are agents that learn to pick actions that lead to high reward. Ideally, the value of a reinforcement learner's policy approaches optimality--where the optimal informed policy is the one which maximizes reward.…

Machine Learning · Computer Science 2021-05-27 Michael K. Cohen , Elliot Catt , Marcus Hutter

Satisficing Exploration for Deep Reinforcement Learning

A default assumption in the design of reinforcement-learning algorithms is that a decision-making agent always explores to learn optimal behavior. In sufficiently complex environments that approach the vastness and scale of the real world,…

Machine Learning · Computer Science 2024-07-23 Dilip Arumugam , Saurabh Kumar , Ramki Gummadi , Benjamin Van Roy