Related papers: The many faces of optimism - Extended version

Beyond Optimism: Exploration With Partially Observable Rewards

Exploration in reinforcement learning (RL) remains an open challenge. RL algorithms rely on observing rewards to train the agent, and if informative rewards are sparse the agent learns slowly or may not learn at all. To improve exploration…

Machine Learning · Computer Science 2024-11-12 Simone Parisi , Alireza Kazemipour , Michael Bowling

A Short Survey on Probabilistic Reinforcement Learning

A reinforcement learning agent tries to maximize its cumulative payoff by interacting in an unknown environment. It is important for the agent to explore suboptimal actions as well as to pick actions with highest known rewards. Yet, in…

Machine Learning · Computer Science 2019-01-23 Reazul Hasan Russel

Optimistic Simulated Exploration as an Incentive for Real Exploration

Many reinforcement learning exploration techniques are overly optimistic and try to explore every state. Such exploration is impossible in environments with the unlimited number of states. I propose to use simulated exploration with an…

Machine Learning · Computer Science 2009-05-20 Ivo Danihelka

Nearly optimal exploration-exploitation decision thresholds

While in general trading off exploration and exploitation in reinforcement learning is hard, under some formulations relatively simple solutions exist. In this paper, we first derive upper bounds for the utility of selecting different…

Artificial Intelligence · Computer Science 2018-06-06 Christos Dimitrakakis

Probabilistic Exploration in Planning while Learning

Sequential decision tasks with incomplete information are characterized by the exploration problem; namely the trade-off between further exploration for learning more about the environment and immediate exploitation of the accrued…

Artificial Intelligence · Computer Science 2013-02-21 Grigoris I. Karakoulas

Optimistic Proximal Policy Optimization

Reinforcement Learning, a machine learning framework for training an autonomous agent based on rewards, has shown outstanding results in various domains. However, it is known that learning a good policy is difficult in a domain where…

Machine Learning · Computer Science 2019-06-27 Takahisa Imagawa , Takuya Hiraoka , Yoshimasa Tsuruoka

Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling

Learning complex robot behavior through interactions with the environment necessitates principled exploration. Effective strategies should prioritize exploring regions of the state-action space that maximize rewards, with optimistic…

Machine Learning · Computer Science 2025-03-12 Jasmine Bayrooti , Carl Henrik Ek , Amanda Prorok

Exploration Conscious Reinforcement Learning Revisited

The Exploration-Exploitation tradeoff arises in Reinforcement Learning when one cannot tell if a policy is optimal. Then, there is a constant need to explore new actions instead of exploiting past experience. In practice, it is common to…

Machine Learning · Computer Science 2019-09-10 Lior Shani , Yonathan Efroni , Shie Mannor

Model-Based Bayesian Exploration

Reinforcement learning systems are often concerned with balancing exploration of untested actions against exploitation of actions that are known to be good. The benefit of exploration can be estimated using the classical notion of Value of…

Artificial Intelligence · Computer Science 2013-01-30 Richard Dearden , Nir Friedman , David Andre

Success Probability of Exploration: a Concrete Analysis of Learning Efficiency

Exploration has been a crucial part of reinforcement learning, yet several important questions concerning exploration efficiency are still not answered satisfactorily by existing analytical frameworks. These questions include exploration…

Machine Learning · Computer Science 2016-12-06 Liangpeng Zhang , Ke Tang , Xin Yao

Modeling Human Exploration Through Resource-Rational Reinforcement Learning

Equipping artificial agents with useful exploration mechanisms remains a challenge to this day. Humans, on the other hand, seem to manage the trade-off between exploration and exploitation effortlessly. In the present article, we put…

Machine Learning · Computer Science 2022-11-15 Marcel Binz , Eric Schulz

Explicit Explore-Exploit Algorithms in Continuous State Spaces

We present a new model-based algorithm for reinforcement learning (RL) which consists of explicit exploration and exploitation phases, and is applicable in large or infinite state spaces. The algorithm maintains a set of dynamics models…

Machine Learning · Computer Science 2019-12-03 Mikael Henaff

Satisficing Exploration for Deep Reinforcement Learning

A default assumption in the design of reinforcement-learning algorithms is that a decision-making agent always explores to learn optimal behavior. In sufficiently complex environments that approach the vastness and scale of the real world,…

Machine Learning · Computer Science 2024-07-23 Dilip Arumugam , Saurabh Kumar , Ramki Gummadi , Benjamin Van Roy

Optimistic World Models: Efficient Exploration in Model-Based Deep Reinforcement Learning

Efficient exploration remains a central challenge in reinforcement learning (RL), particularly in sparse-reward environments. We introduce Optimistic World Models (OWMs), a principled and scalable framework for optimistic exploration that…

Machine Learning · Computer Science 2026-02-11 Akshay Mete , Shahid Aamir Sheikh , Tzu-Hsiang Lin , Dileep Kalathil , P. R. Kumar

Constrained Exploration and Recovery from Experience Shaping

We consider the problem of reinforcement learning under safety requirements, in which an agent is trained to complete a given task, typically formalized as the maximization of a reward signal over time, while concurrently avoiding…

Machine Learning · Computer Science 2018-09-25 Tu-Hoa Pham , Giovanni De Magistris , Don Joven Agravante , Subhajit Chaudhury , Asim Munawar , Ryuki Tachibana

Towards Improving Exploration in Self-Imitation Learning using Intrinsic Motivation

Reinforcement Learning has emerged as a strong alternative to solve optimization tasks efficiently. The use of these algorithms highly depends on the feedback signals provided by the environment in charge of informing about how good (or…

Machine Learning · Computer Science 2022-12-01 Alain Andres , Esther Villar-Rodriguez , Javier Del Ser

Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration

Language model alignment (or, reinforcement learning) techniques that leverage active exploration -- deliberately encouraging the model to produce diverse, informative responses -- offer the promise of super-human capabilities. However,…

Machine Learning · Computer Science 2025-03-17 Dylan J. Foster , Zakaria Mhammedi , Dhruv Rohatgi

Diversity-Driven Exploration Strategy for Deep Reinforcement Learning

Efficient exploration remains a challenging research problem in reinforcement learning, especially when an environment contains large state spaces, deceptive local optima, or sparse rewards. To tackle this problem, we present a…

Artificial Intelligence · Computer Science 2018-10-30 Zhang-Wei Hong , Tzu-Yun Shann , Shih-Yang Su , Yi-Hsiang Chang , Chun-Yi Lee

Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning

Model-based reinforcement learning algorithms with probabilistic dynamical models are amongst the most data-efficient learning methods. This is often attributed to their ability to distinguish between epistemic and aleatoric uncertainty.…

Machine Learning · Computer Science 2020-12-02 Sebastian Curi , Felix Berkenkamp , Andreas Krause

Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning

The exploration \& exploitation dilemma poses significant challenges in reinforcement learning (RL). Recently, curiosity-based exploration methods achieved great success in tackling hard-exploration problems. However, they necessitate…

Machine Learning · Computer Science 2024-12-06 Yiran Wang , Chenshu Liu , Yunfan Li , Sanae Amani , Bolei Zhou , Lin F. Yang