Virtual Replay Cache

Brett Daley; Christopher Amato

Machine Learning 2021-12-08 v1

Authors: Brett Daley, Christopher Amato

Abstract

Return caching is a recent strategy that enables efficient minibatch training with multistep estimators (e.g., the λ-return) for deep reinforcement learning. By precomputing return estimates in sequential batches and then storing the results in an auxiliary data structure for later sampling, the average computation spent per estimate can be greatly reduced. Still, the efficiency of return caching could be improved, particularly with regard to its large memory usage and repetitive data copies. We propose a new data structure, the Virtual Replay Cache (VRC), to address these shortcomings. When learning to play Atari 2600 games, the VRC nearly eliminates DQN(λ)'s cache memory footprint and slightly reduces the total training time on our hardware.
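To make the precomputation step concrete, the following is a minimal sketch of how λ-returns for one contiguous block of transitions might be computed in a single backward pass, using the standard recursion G_t = r_t + γ[(1 − λ) V(s_{t+1}) + λ G_{t+1}]. It is not the authors' implementation and does not show the VRC itself. The function name `lambda_returns`, its parameters, and the default values of `gamma` and `lam` are assumptions chosen for illustration.

```python
def lambda_returns(rewards, next_values, dones, gamma=0.99, lam=0.8):
    """Compute lambda-returns for one contiguous block of transitions.

    Hypothetical helper for illustration only.
    rewards[t]     : reward received at step t
    next_values[t] : bootstrap value estimate for the state after step t
    dones[t]       : True if step t ended its episode
    """
    n = len(rewards)
    returns = [0.0] * n
    # Past the end of the block nothing is known, so the last step
    # falls back to a plain one-step bootstrap.
    next_return = next_values[-1] if n else 0.0
    for t in reversed(range(n)):
        if dones[t]:
            # Terminal step: no bootstrapping across the episode boundary.
            returns[t] = rewards[t]
        else:
            # Blend the one-step bootstrap with the longer return.
            blended = (1.0 - lam) * next_values[t] + lam * next_return
            returns[t] = rewards[t] + gamma * blended
        next_return = returns[t]
    return returns


# Assumed toy data: a three-step episode with zero value estimates.
block = lambda_returns([1.0, 1.0, 1.0], [0.0, 0.0, 0.0],
                       [False, False, True], gamma=0.5, lam=1.0)
```

Because each return reuses the one computed for the following step, a block of n estimates costs one pass over n transitions instead of a separate multistep rollout per sample. This is the amortization the abstract refers to. The results would then be stored for later minibatch sampling.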

Keywords

memory hierarchy; reinforcement learning; computer architecture

Cite

@article{arxiv.2112.03421,
  title  = {Virtual Replay Cache},
  author = {Brett Daley and Christopher Amato},
  journal= {arXiv preprint arXiv:2112.03421},
  year   = {2021}
}

Comments

4 pages, 1 figure, 3 tables
