Dynamic Subgoal-based Exploration via Bayesian Optimization

Yijia Wang; Matthias Poloczek; Daniel R. Jiang

Optimization and Control; Machine Learning. 2023-10-13 (v5)

Abstract

Reinforcement learning in sparse-reward navigation environments with expensive and limited interactions is challenging and poses a need for effective exploration. Motivated by complex navigation tasks that require real-world training (when cheap simulators are not available), we consider an agent that faces an unknown distribution of environments and must decide on an exploration strategy. It may leverage a series of training environments to improve its policy before it is evaluated in a test environment drawn from the same environment distribution. Most existing approaches focus on fixed exploration strategies, while the few that view exploration as a meta-optimization problem tend to ignore the need for cost-efficient exploration. We propose a cost-aware Bayesian optimization approach that efficiently searches over a class of dynamic subgoal-based exploration strategies. The algorithm adjusts a variety of levers -- the locations of the subgoals, the length of each episode, and the number of replications per trial -- in order to overcome the challenges of sparse rewards, expensive interactions, and noise. An experimental evaluation demonstrates that the new approach outperforms existing baselines across a number of problem domains. We also provide a theoretical foundation and prove that the method asymptotically identifies a near-optimal subgoal design.

Keywords

optimization, dynamic programming, game theory

Cite

@article{arxiv.1910.09143,
  title  = {Dynamic Subgoal-based Exploration via Bayesian Optimization},
  author = {Yijia Wang and Matthias Poloczek and Daniel R. Jiang},
  journal= {arXiv preprint arXiv:1910.09143},
  year   = {2023}
}
