Bayesian Reinforcement Learning via Deep, Sparse Sampling

Machine Learning 2020-06-30 v4 Artificial Intelligence Machine Learning

Abstract

We address the problem of Bayesian reinforcement learning using efficient model-based online planning. We propose an optimism-free Bayes-adaptive algorithm that induces deeper and sparser exploration at lower computational complexity, with a theoretical bound on its performance relative to the Bayes-optimal policy. The main novelty is the use of a candidate policy generator to produce long-term options in the planning tree (over beliefs), which allows us to build much sparser and deeper trees. Experimental results on different environments show that, compared to the state of the art, our algorithm is computationally more efficient and obtains significantly higher reward in discrete environments.

Cite

@article{arxiv.1902.02661,
  title   = {Bayesian Reinforcement Learning via Deep, Sparse Sampling},
  author  = {Divya Grover and Debabrota Basu and Christos Dimitrakakis},
  journal = {arXiv preprint arXiv:1902.02661},
  year    = {2020}
}

Comments

Published in AISTATS 2020