Temporal Difference Variational Auto-Encoder

Karol Gregor; George Papamakarios; Frederic Besse; Lars Buesing; Theophane Weber

Machine Learning, 2019-01-03, v3

Abstract

To act and plan in complex environments, we posit that agents should have a mental simulator of the world with three characteristics: (a) it should build an abstract state representing the condition of the world; (b) it should form a belief which represents uncertainty about the world; (c) it should go beyond simple step-by-step simulation, and exhibit temporal abstraction. Motivated by the absence of a model satisfying all these requirements, we propose TD-VAE, a generative sequence model that learns representations containing explicit beliefs about states several steps into the future, and that can be rolled out directly without single-step transitions. TD-VAE is trained on pairs of temporally separated time points, using an analogue of the temporal difference learning used in reinforcement learning.

Keywords

variational autoencoder, temporal graph, multi-agent, reinforcement learning

Cite

@article{arxiv.1806.03107,
  title  = {Temporal Difference Variational Auto-Encoder},
  author = {Karol Gregor and George Papamakarios and Frederic Besse and Lars Buesing and Theophane Weber},
  journal= {arXiv preprint arXiv:1806.03107},
  year   = {2019}
}
