Related papers: Explicit Explore-Exploit Algorithms in Continuous …

Auto-exploration for online reinforcement learning

The exploration-exploitation dilemma in reinforcement learning (RL) is a fundamental challenge to efficient RL algorithms. Existing algorithms for finite state and action discounted RL problems address this by assuming sufficient…

Machine Learning · Computer Science 2025-12-09 Caleb Ju , Guanghui Lan

Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration

Recent advancements in deep reinforcement learning (RL) have demonstrated notable progress in sample efficiency, spanning both model-based and model-free paradigms. Despite the identification and mitigation of specific bottlenecks in prior…

Machine Learning · Computer Science 2024-04-02 Yibo Wang , Jiang Zhao

Exploration via Planning for Information about the Optimal Trajectory

Many potential applications of reinforcement learning (RL) are stymied by the large numbers of samples required to learn an effective policy. This is especially true when applying RL to real-world control tasks, e.g. in the sciences or…

Machine Learning · Computer Science 2022-10-11 Viraj Mehta , Ian Char , Joseph Abbate , Rory Conlin , Mark D. Boyer , Stefano Ermon , Jeff Schneider , Willie Neiswanger

Exploration by Maximizing R\'enyi Entropy for Reward-Free RL Framework

Exploration is essential for reinforcement learning (RL). To face the challenges of exploration, we consider a reward-free RL framework that completely separates exploration from exploitation and brings new challenges for exploration…

Machine Learning · Computer Science 2020-12-11 Chuheng Zhang , Yuanying Cai , Longbo Huang , Jian Li

Explicit Explore, Exploit, or Escape ($E^4$): near-optimal safety-constrained reinforcement learning in polynomial time

In reinforcement learning (RL), an agent must explore an initially unknown environment in order to learn a desired behaviour. When RL agents are deployed in real world environments, safety is of primary concern. Constrained Markov decision…

Machine Learning · Computer Science 2022-06-24 David M. Bossens , Nicholas Bishop

Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning

Motivated by the prevailing paradigm of using unsupervised learning for efficient exploration in reinforcement learning (RL) problems [tang2017exploration,bellemare2016unifying], we investigate when this paradigm is provably efficient. We…

Machine Learning · Computer Science 2020-12-02 Fei Feng , Ruosong Wang , Wotao Yin , Simon S. Du , Lin F. Yang

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles

Reinforcement learning (RL) methods have been shown to be capable of learning intelligent behavior in rich domains. However, this has largely been done in simulated domains without adequate focus on the process of building the simulator. In…

Machine Learning · Computer Science 2019-10-24 Aditya Modi , Nan Jiang , Ambuj Tewari , Satinder Singh

Model-Free Active Exploration in Reinforcement Learning

We study the problem of exploration in Reinforcement Learning and present a novel model-free solution. We adopt an information-theoretical viewpoint and start from the instance-specific lower bound of the number of samples that have to be…

Machine Learning · Computer Science 2024-07-02 Alessio Russo , Alexandre Proutiere

Where-to-Learn: Analytical Policy Gradient Directed Exploration for On-Policy Robotic Reinforcement Learning

On-policy reinforcement learning (RL) algorithms have demonstrated great potential in robotic control, where effective exploration is crucial for efficient and high-quality policy learning. However, how to encourage the agent to explore the…

Robotics · Computer Science 2026-04-02 Leixin Chang , Xinchen Yao , Ben Liu , Liangjing Yang , Hua Chen

Reward-Free Exploration for Reinforcement Learning

Exploration is widely regarded as one of the most challenging aspects of reinforcement learning (RL), with many naive approaches succumbing to exponential sample complexity. To isolate the challenges of exploration, we propose a new…

Machine Learning · Computer Science 2020-02-10 Chi Jin , Akshay Krishnamurthy , Max Simchowitz , Tiancheng Yu

Exploration in Feature Space for Reinforcement Learning

The infamous exploration-exploitation dilemma is one of the oldest and most important problems in reinforcement learning (RL). Deliberate and effective exploration is necessary for RL agents to succeed in most environments. However, until…

Artificial Intelligence · Computer Science 2017-10-09 Suraj Narayanan Sasikumar

PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration

Model-based Reinforcement Learning (RL) is a popular learning paradigm due to its potential sample efficiency compared to model-free RL. However, existing empirical model-based RL approaches lack the ability to explore. This work studies a…

Machine Learning · Computer Science 2021-07-16 Yuda Song , Wen Sun

Reinforcement Learning with Probabilistically Complete Exploration

Balancing exploration and exploitation remains a key challenge in reinforcement learning (RL). State-of-the-art RL algorithms suffer from high sample complexity, particularly in the sparse reward case, where they can do no better than to…

Machine Learning · Computer Science 2020-01-22 Philippe Morere , Gilad Francis , Tom Blau , Fabio Ramos

Divide-and-Conquer Reinforcement Learning

Standard model-free deep reinforcement learning (RL) algorithms sample a new initial state for each trial, allowing them to optimize policies that can perform well even in highly stochastic environments. However, problems that exhibit…

Machine Learning · Computer Science 2018-04-30 Dibya Ghosh , Avi Singh , Aravind Rajeswaran , Vikash Kumar , Sergey Levine

Safe Continuous Control with Constrained Model-Based Policy Optimization

The applicability of reinforcement learning (RL) algorithms in real-world domains often requires adherence to safety constraints, a need difficult to address given the asymptotic nature of the classic RL optimization objective. In contrast…

Machine Learning · Computer Science 2021-04-15 Moritz A. Zanger , Karam Daaboul , J. Marius Zöllner

Toward Efficient Exploration by Large Language Model Agents

A burgeoning area within reinforcement learning (RL) is the design of sequential decision-making agents centered around large language models (LLMs). While autonomous decision-making agents powered by modern LLMs could facilitate numerous…

Machine Learning · Computer Science 2026-02-10 Dilip Arumugam , Thomas L. Griffiths

A Provably Efficient Sample Collection Strategy for Reinforcement Learning

One of the challenges in online reinforcement learning (RL) is that the agent needs to trade off the exploration of the environment and the exploitation of the samples to optimize its behavior. Whether we optimize for regret, sample…

Machine Learning · Computer Science 2021-11-19 Jean Tarbouriech , Matteo Pirotta , Michal Valko , Alessandro Lazaric

LESSON: Learning to Integrate Exploration Strategies for Reinforcement Learning via an Option Framework

In this paper, a unified framework for exploration in reinforcement learning (RL) is proposed based on an option-critic model. The proposed framework learns to integrate a set of diverse exploration strategies so that the agent can…

Machine Learning · Computer Science 2024-09-10 Woojun Kim , Jeonghye Kim , Youngchul Sung

Exploratory State Representation Learning

Not having access to compact and meaningful representations is known to significantly increase the complexity of reinforcement learning (RL). For this reason, it can be useful to perform state representation learning (SRL) before tackling…

Machine Learning · Computer Science 2022-02-16 Astrid Merckling , Nicolas Perrin-Gilbert , Alex Coninx , Stéphane Doncieux

Model-Based Policy Gradients with Parameter-Based Exploration by Least-Squares Conditional Density Estimation

The goal of reinforcement learning (RL) is to let an agent learn an optimal control policy in an unknown environment so that future expected rewards are maximized. The model-free RL approach directly learns the policy based on data samples.…

Machine Learning · Statistics 2013-07-22 Syogo Mori , Voot Tangkaratt , Tingting Zhao , Jun Morimoto , Masashi Sugiyama