Related papers: Contextual Exploration Using a Linear Approximatio…

Neural Risk-sensitive Satisficing in Contextual Bandits

The contextual bandit problem, which is a type of reinforcement learning task, provides an effective framework for solving challenges in recommendation systems, such as satisfying real-time requirements, enabling personalization,…

Machine Learning · Computer Science 2025-01-16 Shogo Ito, Tatsuji Takahashi, Yu Kono

Satisficing Exploration for Deep Reinforcement Learning

A default assumption in the design of reinforcement-learning algorithms is that a decision-making agent always explores to learn optimal behavior. In sufficiently complex environments that approach the vastness and scale of the real world,…

Machine Learning · Computer Science 2024-07-23 Dilip Arumugam, Saurabh Kumar, Ramki Gummadi, Benjamin Van Roy

Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation

Document summarisation can be formulated as a sequential decision-making problem, which can be solved by Reinforcement Learning (RL) algorithms. The predominant RL paradigm for summarisation learns a cross-input policy, which requires…

Computation and Language · Computer Science 2019-07-31 Yang Gao, Christian M. Meyer, Mohsen Mesgar, Iryna Gurevych

Guaranteed satisficing and finite regret: Analysis of a cognitive satisficing value function

As reinforcement learning algorithms are being applied to increasingly complicated and realistic tasks, it is becoming increasingly difficult to solve such problems within a practical time frame. Hence, we focus on a satisficing…

Artificial Intelligence · Computer Science 2025-04-16 Akihiro Tamatsukuri, Tatsuji Takahashi

Adaptive trajectory-constrained exploration strategy for deep reinforcement learning

Deep reinforcement learning (DRL) faces significant challenges in addressing the hard-exploration problems in tasks with sparse or deceptive rewards and large state spaces. These challenges severely limit the practical application of DRL.…

Machine Learning · Computer Science 2024-01-03 Guojian Wang, Faguo Wu, Xiao Zhang, Ning Guo, Zhiming Zheng

Intrinsically Motivated Reinforcement Learning based Recommendation with Counterfactual Data Augmentation

Deep reinforcement learning (DRL) has proven its efficiency in capturing users' dynamic interests in recent literature. However, training a DRL agent is challenging, because of the sparse environment in recommender systems (RS), DRL…

Information Retrieval · Computer Science 2022-09-20 Xiaocong Chen, Siyu Wang, Lina Yao, Lianyong Qi, Yong Li

Overcoming Exploration in Reinforcement Learning with Demonstrations

Exploration in environments with sparse rewards has been a persistent problem in reinforcement learning (RL). Many tasks are natural to specify with a sparse reward, and manually shaping a reward function can result in suboptimal…

Machine Learning · Computer Science 2018-02-27 Ashvin Nair, Bob McGrew, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel

Reinforcement Learning with Probabilistically Complete Exploration

Balancing exploration and exploitation remains a key challenge in reinforcement learning (RL). State-of-the-art RL algorithms suffer from high sample complexity, particularly in the sparse reward case, where they can do no better than to…

Machine Learning · Computer Science 2020-01-22 Philippe Morere, Gilad Francis, Tom Blau, Fabio Ramos

Explicit Explore-Exploit Algorithms in Continuous State Spaces

We present a new model-based algorithm for reinforcement learning (RL) which consists of explicit exploration and exploitation phases, and is applicable in large or infinite state spaces. The algorithm maintains a set of dynamics models…

Machine Learning · Computer Science 2019-12-03 Mikael Henaff

DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning

Sparse-reward reinforcement learning (RL) can model a wide range of highly complex tasks. Solving sparse-reward tasks is RL's core premise, requiring efficient exploration coupled with long-horizon credit assignment, and overcoming these…

Machine Learning · Computer Science 2025-10-21 Leander Diaz-Bone, Marco Bagatella, Jonas Hübotter, Andreas Krause

Prompted Policy Search: Reinforcement Learning through Linguistic and Numerical Reasoning in LLMs

Reinforcement Learning (RL) traditionally relies on scalar reward signals, limiting its ability to leverage the rich semantic knowledge often available in real-world tasks. In contrast, humans learn efficiently by combining numerical…

Machine Learning · Computer Science 2025-12-01 Yifan Zhou, Sachin Grover, Mohamed El Mistiri, Kamalesh Kalirathnam, Pratyush Kerhalkar, Swaroop Mishra, Neelesh Kumar, Sanket Gaurav, Oya Aran, Heni Ben Amor

Meta-Reinforcement Learning of Structured Exploration Strategies

Exploration is a fundamental challenge in reinforcement learning (RL). Many of the current exploration methods for deep RL use task-agnostic objectives, such as information gain or bonuses based on state visitation. However, many practical…

Machine Learning · Computer Science 2018-02-21 Abhishek Gupta, Russell Mendonca, YuXuan Liu, Pieter Abbeel, Sergey Levine

Diversity-Driven Exploration Strategy for Deep Reinforcement Learning

Efficient exploration remains a challenging research problem in reinforcement learning, especially when an environment contains large state spaces, deceptive local optima, or sparse rewards. To tackle this problem, we present a…

Artificial Intelligence · Computer Science 2018-10-30 Zhang-Wei Hong, Tzu-Yun Shann, Shih-Yang Su, Yi-Hsiang Chang, Chun-Yi Lee

Reinforcement Learning with a Focus on Adjusting Policies to Reach Targets

The objective of a reinforcement learning agent is to discover better actions through exploration. However, typical exploration techniques aim to maximize rewards, often incurring high costs in both exploration and learning processes. We…

Machine Learning · Computer Science 2024-12-24 Akane Tsuboya, Yu Kono, Tatsuji Takahashi

Toward Efficient Exploration by Large Language Model Agents

A burgeoning area within reinforcement learning (RL) is the design of sequential decision-making agents centered around large language models (LLMs). While autonomous decision-making agents powered by modern LLMs could facilitate numerous…

Machine Learning · Computer Science 2026-02-10 Dilip Arumugam, Thomas L. Griffiths

Resilient Constrained Reinforcement Learning

We study a class of constrained reinforcement learning (RL) problems in which multiple constraint specifications are not identified before training. It is challenging to identify appropriate constraint specifications due to the undefined…

Optimization and Control · Mathematics 2024-01-02 Dongsheng Ding, Zhengyan Huan, Alejandro Ribeiro

Overcoming Exploration: Deep Reinforcement Learning for Continuous Control in Cluttered Environments from Temporal Logic Specifications

Model-free continuous control for robot navigation tasks using Deep Reinforcement Learning (DRL) that relies on noisy policies for exploration is sensitive to the density of rewards. In practice, robots are usually deployed in cluttered…

Robotics · Computer Science 2023-02-24 Mingyu Cai, Erfan Aasi, Calin Belta, Cristian-Ioan Vasile

Information-Directed Exploration for Deep Reinforcement Learning

Efficient exploration remains a major challenge for reinforcement learning. One reason is that the variability of the returns often depends on the current state and action, and is therefore heteroscedastic. Classical exploration strategies…

Machine Learning · Computer Science 2019-03-26 Nikolay Nikolov, Johannes Kirschner, Felix Berkenkamp, Andreas Krause

Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration

Language model alignment (or, reinforcement learning) techniques that leverage active exploration -- deliberately encouraging the model to produce diverse, informative responses -- offer the promise of super-human capabilities. However,…

Machine Learning · Computer Science 2025-03-17 Dylan J. Foster, Zakaria Mhammedi, Dhruv Rohatgi

Efficient Exploration in Deep Reinforcement Learning: A Novel Bayesian Actor-Critic Algorithm

Reinforcement learning (RL) and Deep Reinforcement Learning (DRL), in particular, have the potential to disrupt and are already changing the way we interact with the world. One of the key indicators of their applicability is their ability…

Machine Learning · Computer Science 2024-08-20 Nikolai Rozanov