Memory-Efficient Backpropagation Through Time

Neural and Evolutionary Computing 2016-06-13 v1 Machine Learning

Abstract

We propose a novel approach to reduce the memory consumption of the backpropagation through time (BPTT) algorithm when training recurrent neural networks (RNNs). Our approach uses dynamic programming to balance the trade-off between caching intermediate results and recomputing them. The algorithm can fit tightly within almost any user-set memory budget while finding an optimal execution policy that minimizes the computational cost. Computational devices have limited memory capacity, and maximizing computational performance under a fixed memory budget is a practical use case. We provide asymptotic computational upper bounds for various regimes. The algorithm is particularly effective for long sequences. For sequences of length 1000, our algorithm saves 95% of memory usage while using only one third more time per iteration than standard BPTT.
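The caching-versus-recomputation trade-off can be illustrated with a small dynamic program. The sketch below is not taken from the abstract: it assumes a simplified cost model in which memory is counted in slots that each hold one hidden state, and cost is counted in forward steps. The function name `min_forward_cost`, the recurrence, and its boundary conditions are assumptions made for illustration only.

```python
def min_forward_cost(t_max: int, m_max: int) -> int:
    """Minimum number of forward steps needed to backpropagate through
    t_max time steps when at most m_max hidden states can be cached.

    Assumed (hypothetical) cost model:
      C(1, m) = 1               a single step is run forward once
      C(t, 1) = t * (t + 1) / 2 one slot: replay from the start for every step
      C(t, m) = min over y of   y + C(t - y, m - 1) + C(y, m)
    where y is the position at which the next hidden state is cached.
    """
    # C[t][m] holds the optimal cost for t steps and m memory slots.
    C = [[0] * (m_max + 1) for _ in range(t_max + 1)]
    for m in range(1, m_max + 1):
        for t in range(1, t_max + 1):
            if t == 1:
                C[t][m] = 1
            elif m == 1:
                C[t][m] = t * (t + 1) // 2
            else:
                # Advance y steps and cache that state, then solve the right
                # part with one slot fewer and the left part with all m slots.
                C[t][m] = min(
                    y + C[t - y][m - 1] + C[y][m] for y in range(1, t)
                )
    return C[t_max][m_max]
```

Under these assumptions the cost never increases as slots are added, so comparing `min_forward_cost(t, m)` across several values of `m` shows how much recomputation a given memory budget buys back.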

Cite

@article{arxiv.1606.03401,
  title  = {Memory-Efficient Backpropagation Through Time},
  author = {Audrūnas Gruslys and Remi Munos and Ivo Danihelka and Marc Lanctot and Alex Graves},
  journal= {arXiv preprint arXiv:1606.03401},
  year   = {2016}
}