Related papers: Model-Based Bayesian Exploration
All reinforcement learning algorithms must handle the trade-off between exploration and exploitation. Many state-of-the-art deep reinforcement learning methods use noise in the action selection, such as Gaussian noise in policy gradient…
A reinforcement learning agent tries to maximize its cumulative payoff by interacting in an unknown environment. It is important for the agent to explore suboptimal actions as well as to pick actions with highest known rewards. Yet, in…
An effective approach to exploration in reinforcement learning is to rely on an agent's uncertainty over the optimal policy, which can yield near-optimal exploration strategies in tabular settings. However, in non-tabular settings that…
The explore{exploit dilemma is one of the central challenges in Reinforcement Learning (RL). Bayesian RL solves the dilemma by providing the agent with information in the form of a prior distribution over environments; however, full…
Optimal probabilistic approach in reinforcement learning is computationally infeasible. Its simplification consisting in neglecting difference between true environment and its model estimated using limited number of observations causes…
We consider a ubiquitous scenario in the Internet economy when individual decision-makers (henceforth, agents) both produce and consume information as they make strategic choices in an uncertain environment. This creates a three-way…
We introduce exploration potential, a quantity that measures how much a reinforcement learning agent has explored its environment class. In contrast to information gain, exploration potential takes the problem's reward structure into…
Efficiently tackling multiple tasks within complex environment, such as those found in robot manipulation, remains an ongoing challenge in robotics and an opportunity for data-driven solutions, such as reinforcement learning (RL).…
Research in reinforcement learning has produced algorithms for optimal decision making under uncertainty that fall within two main types. The first employs a Bayesian framework, where optimality improves with increased computational time.…
We present a modular approach to reinforcement learning that uses a Bayesian representation of the uncertainty over models. The approach, BOSS (Best of Sampled Set), drives exploration by sampling multiple models from the posterior and…
Sequential decision tasks with incomplete information are characterized by the exploration problem; namely the trade-off between further exploration for learning more about the environment and immediate exploitation of the accrued…
Exploration is a difficult challenge in reinforcement learning and is of prime importance in sparse reward environments. However, many of the state of the art deep reinforcement learning algorithms, that rely on epsilon-greedy, fail on…
In this work, we address the challenge of data-efficient exploration in reinforcement learning by examining existing principled, information-theoretic approaches to intrinsic motivation. Specifically, we focus on a class of exploration…
Efficient exploration remains a challenging problem in reinforcement learning, especially for those tasks where rewards from environments are sparse. A commonly used approach for exploring such environments is to introduce some "intrinsic"…
Incomplete knowledge of the environment leads an agent to make decisions under uncertainty. One of the major dilemmas in Reinforcement Learning (RL) where an autonomous agent has to balance two contrasting needs in making its decisions is:…
Equipping artificial agents with useful exploration mechanisms remains a challenge to this day. Humans, on the other hand, seem to manage the trade-off between exploration and exploitation effortlessly. In the present article, we put…
Efficient exploration is an unsolved problem in Reinforcement Learning which is usually addressed by reactively rewarding the agent for fortuitously encountering novel situations. This paper introduces an efficient active exploration…
The Exploration-Exploitation tradeoff arises in Reinforcement Learning when one cannot tell if a policy is optimal. Then, there is a constant need to explore new actions instead of exploiting past experience. In practice, it is common to…
Data selection is essential for any data-based optimization technique, such as Reinforcement Learning. State-of-the-art sampling strategies for the experience replay buffer improve the performance of the Reinforcement Learning agent.…
We study the problem of exploration in Reinforcement Learning and present a novel model-free solution. We adopt an information-theoretical viewpoint and start from the instance-specific lower bound of the number of samples that have to be…