Optimizing Data Collection in Deep Reinforcement Learning

James Gleeson; Daniel Snider; Yvonne Yang; Moshe Gabel; Eyal de Lara; Gennady Pekhimenko

Machine Learning 2022-07-19 v1

Abstract

Reinforcement learning (RL) workloads take a notoriously long time to train due to the large number of samples collected at run-time from simulators. Unfortunately, cluster scale-up approaches remain expensive, and commonly used CPU implementations of simulators induce high overhead when switching back and forth between GPU computations. We explore two optimizations that increase RL data collection efficiency by increasing GPU utilization: (1) GPU vectorization: parallelizing simulation on the GPU for increased hardware parallelism, and (2) simulator kernel fusion: fusing multiple simulation steps to run in a single GPU kernel launch to reduce global memory bandwidth requirements. We find that GPU vectorization can achieve up to $1024\times$ speedup over commonly used CPU simulators. We profile the performance of different implementations and show that for a simple simulator, ML compiler implementations (XLA) of GPU vectorization outperform a DNN framework (PyTorch) by $13.4\times$ by reducing CPU overhead from repeated Python to DL backend API calls. We show that simulator kernel fusion speedups with a simple simulator are $11.3\times$ and increase by up to $1024\times$ as simulator complexity increases in terms of memory bandwidth requirements. We show that the speedups from simulator kernel fusion are orthogonal and combinable with GPU vectorization, leading to a multiplicative speedup.
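The two optimizations named in the abstract can be illustrated with a minimal JAX sketch. This is not the authors' implementation: the toy point-mass simulator `step`, its constants, and the names `batched_step`, `rollout`, and `fused_rollout` are all hypothetical and chosen only for illustration. `jax.vmap` stands in for GPU vectorization (one call advances every environment), and `jax.jit` over `jax.lax.scan` stands in for simulator kernel fusion (many steps compiled into one XLA program). Whether XLA lowers this to a single GPU kernel depends on the compiler and backend, so treat the fusion part as an approximation of the idea.

```python
import jax
import jax.numpy as jnp


def step(state, action):
    """Toy 1-D point-mass simulator (hypothetical, not from the paper)."""
    pos, vel = state
    vel = vel + 0.1 * action      # integrate acceleration
    pos = pos + 0.1 * vel         # integrate velocity
    reward = -jnp.abs(pos)        # penalize distance from the origin
    return (pos, vel), reward


# Vectorization: one call steps every environment in the batch.
batched_step = jax.vmap(step)


def rollout(state, actions):
    """Run all steps in one call; actions has shape (num_steps, num_envs)."""
    # lax.scan threads the state through batched_step once per row of actions.
    return jax.lax.scan(batched_step, state, actions)


# Fusion (approximated): compile the whole multi-step rollout as one XLA
# program instead of issuing one Python-level call per simulation step.
fused_rollout = jax.jit(rollout)

num_envs, num_steps = 4, 3
init_state = (jnp.zeros(num_envs), jnp.zeros(num_envs))
actions = jnp.ones((num_steps, num_envs))

final_state, rewards = fused_rollout(init_state, actions)
print(rewards.shape)  # one reward per step per environment
```

Because the Python interpreter is entered once per rollout rather than once per step, this structure also avoids the repeated Python-to-backend call overhead that the abstract attributes to the PyTorch implementation.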

Keywords

gpu computing, large language model inference, large language model training

Cite

@article{arxiv.2207.07736,
  title  = {Optimizing Data Collection in Deep Reinforcement Learning},
  author = {James Gleeson and Daniel Snider and Yvonne Yang and Moshe Gabel and Eyal de Lara and Gennady Pekhimenko},
  journal= {arXiv preprint arXiv:2207.07736},
  year   = {2022}
}

Comments

MLBench 2022 (https://memani1.github.io/mlbench22/) camera-ready submission
