Efficient Parallel Methods for Deep Reinforcement Learning

Machine Learning 2017-05-17 v2

Abstract

We propose a novel framework for efficient parallelization of deep reinforcement learning algorithms, enabling these algorithms to learn from multiple actors on a single machine. The framework is algorithm-agnostic and can be applied to on-policy, off-policy, value-based, and policy-gradient-based algorithms. Given its inherent parallelism, the framework can be implemented efficiently on a GPU, allowing the use of powerful models while significantly reducing training time. We demonstrate the effectiveness of our framework by implementing an advantage actor-critic algorithm on a GPU, using on-policy experiences and employing synchronous updates. Our algorithm achieves state-of-the-art performance on the Atari domain after only a few hours of training. Our framework thus opens the door to much faster experimentation on demanding problem domains. Our implementation is open source and publicly available at https://github.com/alfredvc/paac
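The core idea in the abstract (several actors stepped in lockstep on one machine, with a single batched forward pass and one synchronous on-policy update) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: `ToyChainEnv`, the linear softmax policy, the linear value function, and all hyperparameter values below are assumptions chosen only to keep the example small and runnable on a CPU.

```python
import numpy as np

class ToyChainEnv:
    """Hypothetical 5-state chain; reaching the right end gives reward 1."""
    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def _obs(self):
        obs = np.zeros(self.length)
        obs[self.pos] = 1.0
        return obs

    def reset(self):
        self.pos = 0
        return self._obs()

    def step(self, action):
        # Action 1 moves right, action 0 moves left (clipped at the ends).
        move = 1 if action == 1 else -1
        self.pos = min(max(self.pos + move, 0), self.length - 1)
        done = self.pos == self.length - 1
        reward = 1.0 if done else 0.0
        obs = self.reset() if done else self._obs()  # auto-reset on episode end
        return obs, reward, done

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_step(envs, states, W_pi, w_v, rng, t_max=5, gamma=0.99, lr=0.1):
    """Collect t_max steps from every actor, then apply one synchronous update."""
    obs_buf, act_buf, rew_buf, done_buf = [], [], [], []
    for _ in range(t_max):
        probs = softmax(states @ W_pi)  # one batched forward pass for all actors
        actions = np.array([rng.choice(probs.shape[1], p=p) for p in probs])
        steps = [env.step(a) for env, a in zip(envs, actions)]
        obs_buf.append(states)
        act_buf.append(actions)
        rew_buf.append(np.array([s[1] for s in steps]))
        done_buf.append(np.array([s[2] for s in steps]))
        states = np.stack([s[0] for s in steps])

    # n-step returns, bootstrapped from the value of the last state of each actor.
    R = states @ w_v
    returns = np.zeros((t_max, len(envs)))
    for t in reversed(range(t_max)):
        R = rew_buf[t] + gamma * R * (~done_buf[t])
        returns[t] = R

    # Flatten (time, actor) into a single batch and update once.
    obs = np.concatenate(obs_buf)
    acts = np.concatenate(act_buf)
    adv = returns.reshape(-1) - obs @ w_v            # advantage estimate
    onehot = np.eye(W_pi.shape[1])[acts]
    grad_logits = (onehot - softmax(obs @ W_pi)) * adv[:, None]
    W_pi = W_pi + lr * obs.T @ grad_logits / len(obs)  # policy gradient ascent
    w_v = w_v + lr * obs.T @ adv / len(obs)            # value regression step
    return states, W_pi, w_v, returns

rng = np.random.default_rng(0)
envs = [ToyChainEnv() for _ in range(8)]  # 8 actors, an arbitrary choice
states = np.stack([env.reset() for env in envs])
W_pi, w_v = np.zeros((5, 2)), np.zeros(5)
for _ in range(200):
    states, W_pi, w_v, returns = train_step(envs, states, W_pi, w_v, rng)
```

Because every actor contributes to the same batch, the model is evaluated once per time step rather than once per actor, which is the property that makes a GPU implementation attractive. In this sketch the environments are stepped sequentially in a Python loop for simplicity.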

Cite

@article{arxiv.1705.04862,
  title  = {Efficient Parallel Methods for Deep Reinforcement Learning},
  author = {Alfredo V. Clemente and Humberto N. Castejón and Arjun Chandra},
  journal = {arXiv preprint arXiv:1705.04862},
  year   = {2017}
}