English

Continual Auxiliary Task Learning

Machine Learning 2022-02-24 v1

Abstract

Learning auxiliary tasks, such as multiple predictions about the world, can provide many benefits to reinforcement learning systems. A variety of off-policy learning algorithms have been developed to learn such predictions, but as yet there is little work on how to adapt the behavior to gather useful data for those off-policy predictions. In this work, we investigate a reinforcement learning system designed to learn a collection of auxiliary tasks, with a behavior policy learning to take actions to improve those auxiliary predictions. We highlight the inherent non-stationarity in this continual auxiliary task learning problem, for both prediction learners and the behavior learner. We develop an algorithm based on successor features that facilitates tracking under non-stationary rewards, and prove the separation into learning successor features and rewards provides convergence rate improvements. We conduct an in-depth study into the resulting multi-prediction learning system.

Keywords

Cite

@article{arxiv.2202.11133,
  title  = {Continual Auxiliary Task Learning},
  author = {Matthew McLeod and Chunlok Lo and Matthew Schlegel and Andrew Jacobsen and Raksha Kumaraswamy and Martha White and Adam White},
  journal= {arXiv preprint arXiv:2202.11133},
  year   = {2022}
}

Comments

Neural Information Processing Systems 2021