Related papers: Encoding-based Memory Modules for Recurrent Neural…

Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory

The effectiveness of recurrent neural networks can be largely influenced by their ability to store into their dynamical memory information extracted from input sequences at different frequencies and timescales. Such a feature can be…

Machine Learning · Computer Science 2020-07-01 Antonio Carta, Alessandro Sperduti, Davide Bacciu

Short-Term Memory Optimization in Recurrent Neural Networks by Autoencoder-based Initialization

Training RNNs to learn long-term dependencies is difficult due to vanishing gradients. We explore an alternative solution based on explicit memorization using linear autoencoders for sequences, which allows to maximize the short-term memory…

Machine Learning · Computer Science 2020-11-06 Antonio Carta, Alessandro Sperduti, Davide Bacciu

Linear Memory Networks

Recurrent neural networks can learn complex transduction problems that require maintaining and actively exploiting a memory of their inputs. Such models traditionally consider memory and input-output functionalities indissolubly entangled.…

Machine Learning · Computer Science 2018-11-09 Davide Bacciu, Antonio Carta, Alessandro Sperduti

Learning Sequence Attractors in Recurrent Networks with Hidden Neurons

The brain is targeted for processing temporal sequence information. It remains largely unclear how the brain learns to store and retrieve sequence memories. Here, we study how recurrent networks of binary neurons learn sequence attractors…

Neural and Evolutionary Computing · Computer Science 2024-04-04 Yao Lu, Si Wu

Learning Longer Memory in Recurrent Neural Networks

Recurrent neural network is a powerful model that learns temporal patterns in sequential data. For a long time, it was believed that recurrent networks are difficult to train using simple optimizers, such as stochastic gradient descent, due…

Neural and Evolutionary Computing · Computer Science 2015-04-20 Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, Marc'Aurelio Ranzato

Continuous Learning in a Hierarchical Multiscale Neural Network

We reformulate the problem of encoding a multi-scale representation of a sequence in a language model by casting it in a continuous learning framework. We propose a hierarchical multi-scale language model in which short time-scale…

Computation and Language · Computer Science 2018-05-16 Thomas Wolf, Julien Chaumond, Clement Delangue

Overparameterized Neural Networks Implement Associative Memory

Identifying computational mechanisms for memorization and retrieval of data is a long-standing problem at the intersection of machine learning and neuroscience. Our main finding is that standard overparameterized deep neural networks…

Machine Learning · Computer Science 2022-05-25 Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler

Recurrent Network Models Of Sequence Generation And Memory

Sequential activation of neurons is a common feature of network activity during a variety of behaviors, including working memory and decision making. Previous network models for sequences and memory emphasized specialized architectures in…

Neurons and Cognition · Quantitative Biology 2016-03-16 Kanaka Rajan, Christopher D Harvey, David W Tank

Modeling the dynamics of human brain activity with recurrent neural networks

Encoding models are used for predicting brain activity in response to sensory stimuli with the objective of elucidating how sensory information is represented in the brain. Encoding models typically comprise a nonlinear transformation of…

Neurons and Cognition · Quantitative Biology 2017-03-13 Umut Güçlü, Marcel A. J. van Gerven

Context- and Sequence-Aware Convolutional Recurrent Encoder for Neural Machine Translation

Neural Machine Translation model is a sequence-to-sequence converter based on neural networks. Existing models use recurrent neural networks to construct both the encoder and decoder modules. In alternative research, the recurrent networks…

Computation and Language · Computer Science 2021-05-04 Ritam Mallick, Seba Susan, Vaibhaw Agrawal, Rizul Garg, Prateek Rawal

Slow manifolds in recurrent networks encode working memory efficiently and robustly

Working memory is a cognitive function involving the storage and manipulation of latent information over brief intervals of time, thus making it crucial for context-dependent computation. Here, we use a top-down modeling approach to examine…

Neurons and Cognition · Quantitative Biology 2021-11-17 Elham Ghazizadeh, ShiNung Ching

Learning to update Auto-associative Memory in Recurrent Neural Networks for Improving Sequence Memorization

Learning to remember long sequences remains a challenging task for recurrent neural networks. Register memory and attention mechanisms were both proposed to resolve the issue with either high computational cost to retain memory…

Artificial Intelligence · Computer Science 2017-10-04 Wei Zhang, Bowen Zhou

Higher Order Recurrent Neural Networks

In this paper, we study novel neural network structures to better model long term dependency in sequential data. We propose to use more memory units to keep track of more preceding states in recurrent neural networks (RNNs), which are all…

Neural and Evolutionary Computing · Computer Science 2016-05-03 Rohollah Soltani, Hui Jiang

On Training Recurrent Networks with Truncated Backpropagation Through Time in Speech Recognition

Recurrent neural networks have been the dominant models for many speech and language processing tasks. However, we understand little about the behavior and the class of functions recurrent networks can realize. Moreover, the heuristics used…

Computation and Language · Computer Science 2018-11-01 Hao Tang, James Glass

Learning Sequence Representations by Non-local Recurrent Neural Memory

The key challenge of sequence representation learning is to capture the long-range temporal dependencies. Typical methods for supervised sequence representation learning are built upon recurrent neural networks to capture temporal…

Computer Vision and Pattern Recognition · Computer Science 2022-07-21 Wenjie Pei, Xin Feng, Canmiao Fu, Qiong Cao, Guangming Lu, Yu-Wing Tai

Memformer: A Memory-Augmented Transformer for Sequence Modeling

Transformers have reached remarkable success in sequence modeling. However, these models have efficiency issues as they need to store all the history token-level representations as memory. We present Memformer, an efficient neural network…

Computation and Language · Computer Science 2022-04-14 Qingyang Wu, Zhenzhong Lan, Kun Qian, Jing Gu, Alborz Geramifard, Zhou Yu

Hierarchical Attention Encoder Decoder

Recent advances in large language models have shown that autoregressive modeling can generate complex and novel sequences that have many real-world applications. However, these models must generate outputs autoregressively, which becomes…

Machine Learning · Computer Science 2023-06-05 Asier Mujika

Understanding How Encoder-Decoder Architectures Attend

Encoder-decoder networks with attention have proven to be a powerful way to solve many sequence-to-sequence tasks. In these networks, attention aligns encoder and decoder states and is often used for visualizing network behavior. However,…

Machine Learning · Computer Science 2021-10-29 Kyle Aitken, Vinay V Ramasesh, Yuan Cao, Niru Maheswaranathan

Discriminative Recurrent Sparse Auto-Encoders

We present the discriminative recurrent sparse auto-encoder model, comprising a recurrent encoder of rectified linear units, unrolled for a fixed number of iterations, and connected to two linear decoders that reconstruct the input and…

Machine Learning · Computer Science 2013-03-20 Jason Tyler Rolfe, Yann LeCun

Convolutional Sequence to Sequence Learning

The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks. Compared to…

Computation and Language · Computer Science 2017-07-26 Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin