Causally Correct Partial Models for Reinforcement Learning

Danilo J. Rezende; Ivo Danihelka; George Papamakarios; Nan Rosemary Ke; Ray Jiang; Theophane Weber; Karol Gregor; Hamza Merzic; Fabio Viola; Jane Wang; Jovana Mitrovic; Frederic Besse; Ioannis Antonoglou; Lars Buesing

Machine Learning 2020-02-10 v1 Artificial Intelligence Machine Learning

Abstract

In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions. However, jointly modeling future observations can be computationally expensive or even intractable if the observations are high-dimensional (e.g. images). For this reason, previous works have considered partial models, which model only part of the observation. In this paper, we show that partial models can be causally incorrect: they are confounded by the observations they don't model, and can therefore lead to incorrect planning. To address this, we introduce a general family of partial models that are provably causally correct, yet remain fast because they do not need to fully model future observations.
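The confounding described in the abstract can be illustrated with a minimal sketch. The setup below is a hypothetical toy problem, not the paper's environment or method: a hidden observation `z` (the part a partial model leaves out) drives both the behavior policy's action and the reward. All names (`P_Z`, `behavior_policy`, `reward`) are assumptions introduced for illustration. A model conditioned only on the action fits the observational quantity E[r | a], whereas planning needs the interventional quantity E[r | do(a)], computed here with a standard backdoor adjustment over `z`.

```python
from fractions import Fraction

# Hypothetical toy setup: z is the unmodeled observation, uniform over {0, 1}.
P_Z = {0: Fraction(1, 2), 1: Fraction(1, 2)}


def behavior_policy(a, z):
    """pi(a | z): the behavior agent sees z and always copies it."""
    return Fraction(1) if a == z else Fraction(0)


def reward(a, z):
    """Reward is 1 when the action matches the hidden observation, else 0."""
    return 1 if a == z else 0


def observational_reward(a):
    """E[r | a] under behavior data: what a model conditioned only on a fits."""
    num = sum(P_Z[z] * behavior_policy(a, z) * reward(a, z) for z in P_Z)
    den = sum(P_Z[z] * behavior_policy(a, z) for z in P_Z)
    return num / den


def interventional_reward(a):
    """E[r | do(a)] = sum_z p(z) E[r | a, z] (backdoor adjustment over z)."""
    return sum(P_Z[z] * reward(a, z) for z in P_Z)


print(observational_reward(1))   # 1
print(interventional_reward(1))  # 1/2
```

In this sketch the action-only model predicts a reward of 1 for action 1, because in the behavior data that action was only ever taken when it was correct. A planner that forces action 1 without seeing `z` would obtain 1/2 on average, so planning with the confounded estimate would overvalue the action.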

Keywords

machine learning theory, causal inference, machine learning

Cite

@article{arxiv.2002.02836,
  title  = {Causally Correct Partial Models for Reinforcement Learning},
  author = {Danilo J. Rezende and Ivo Danihelka and George Papamakarios and Nan Rosemary Ke and Ray Jiang and Theophane Weber and Karol Gregor and Hamza Merzic and Fabio Viola and Jane Wang and Jovana Mitrovic and Frederic Besse and Ioannis Antonoglou and Lars Buesing},
  journal= {arXiv preprint arXiv:2002.02836},
  year   = {2020}
}
