Nested LSTMs

Joel Ruben Antony Moniz; David Krueger

Categories: Computation and Language, Machine Learning. Date: 2018-02-01 (v1)

Abstract

We propose Nested LSTMs (NLSTM), a novel RNN architecture with multiple levels of memory. Nested LSTMs add depth to LSTMs via nesting as opposed to stacking. The value of a memory cell in an NLSTM is computed by an LSTM cell, which has its own inner memory cell. Specifically, instead of computing the value of the (outer) memory cell as $c^{outer}_t = f_t \odot c_{t-1} + i_t \odot g_t$, NLSTM memory cells use the concatenation $(f_t \odot c_{t-1}, i_t \odot g_t)$ as input to an inner LSTM (or NLSTM) memory cell, and set $c^{outer}_t = h^{inner}_t$. Nested LSTMs outperform both stacked and single-layer LSTMs with similar numbers of parameters in our experiments on various character-level language modeling tasks, and the inner memories of an NLSTM learn longer-term dependencies than the higher-level units of a stacked LSTM.
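
The update described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it reads the pair $(f_t \odot c_{t-1}, i_t \odot g_t)$ as the inner cell's previous hidden state and current input respectively, uses sigmoid gates with tanh nonlinearities, and nests only one level deep. The names `init_params`, `gates`, and `nlstm_step` are hypothetical helpers introduced here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(input_size, hidden_size, rng):
    """Random weights for one LSTM-style block (i, f, g, o stacked side by side)."""
    W = rng.normal(scale=0.1, size=(input_size + hidden_size, 4 * hidden_size))
    b = np.zeros(4 * hidden_size)
    return W, b

def gates(x, h_prev, W, b):
    """Compute input gate, forget gate, candidate, and output gate."""
    z = np.concatenate([x, h_prev]) @ W + b
    i, f, g, o = np.split(z, 4)
    return sigmoid(i), sigmoid(f), np.tanh(g), sigmoid(o)

def nlstm_step(x, h_prev, c_prev, c_inner_prev, outer, inner):
    """One time step of a depth-2 nested LSTM (assumed reading of the abstract)."""
    i, f, g, o = gates(x, h_prev, *outer)

    # A standard LSTM would set c = f * c_prev + i * g here.
    # Instead, the two terms are handed to an inner LSTM cell.
    inner_h_prev = f * c_prev   # assumption: plays the role of the inner hidden state
    inner_x = i * g             # assumption: plays the role of the inner input

    ii, fi, gi, oi = gates(inner_x, inner_h_prev, *inner)
    c_inner = fi * c_inner_prev + ii * gi   # ordinary LSTM update, one level down
    h_inner = oi * np.tanh(c_inner)

    c = h_inner                 # c^{outer}_t = h^{inner}_t
    h = o * np.tanh(c)
    return h, c, c_inner

# Usage: run five random inputs through the cell.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
outer = init_params(n_in, n_hid, rng)
inner = init_params(n_hid, n_hid, rng)   # inner input and hidden sizes both equal n_hid
h = c = c_inner = np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):
    h, c, c_inner = nlstm_step(x, h, c, c_inner, outer, inner)
```

Replacing the inner block with another call to `nlstm_step` would give deeper nesting, which is what the abstract's "(or NLSTM)" alludes to; the sketch leaves that out for brevity.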

Keywords

long short-term memory, memory hierarchy, memristor

Cite

@article{arxiv.1801.10308,
  title  = {Nested LSTMs},
  author = {Joel Ruben Antony Moniz and David Krueger},
  journal= {arXiv preprint arXiv:1801.10308},
  year   = {2018}
}

Comments

Accepted at ACML 2017
