Related papers: Encoding-based Memory Modules for Recurrent Neural…

200 papers

The effectiveness of recurrent neural networks can be largely influenced by their ability to store into their dynamical memory information extracted from input sequences at different frequencies and timescales. Such a feature can be…

Machine Learning · Computer Science 2020-07-01 Antonio Carta, Alessandro Sperduti, Davide Bacciu

Training RNNs to learn long-term dependencies is difficult due to vanishing gradients. We explore an alternative solution based on explicit memorization using linear autoencoders for sequences, which allows to maximize the short-term memory…

Machine Learning · Computer Science 2020-11-06 Antonio Carta, Alessandro Sperduti, Davide Bacciu

Recurrent neural networks can learn complex transduction problems that require maintaining and actively exploiting a memory of their inputs. Such models traditionally consider memory and input-output functionalities indissolubly entangled.…

Machine Learning · Computer Science 2018-11-09 Davide Bacciu, Antonio Carta, Alessandro Sperduti

The brain is targeted for processing temporal sequence information. It remains largely unclear how the brain learns to store and retrieve sequence memories. Here, we study how recurrent networks of binary neurons learn sequence attractors…

Neural and Evolutionary Computing · Computer Science 2024-04-04 Yao Lu, Si Wu

Recurrent neural network is a powerful model that learns temporal patterns in sequential data. For a long time, it was believed that recurrent networks are difficult to train using simple optimizers, such as stochastic gradient descent, due…

Neural and Evolutionary Computing · Computer Science 2015-04-20 Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, Marc'Aurelio Ranzato

We reformulate the problem of encoding a multi-scale representation of a sequence in a language model by casting it in a continuous learning framework. We propose a hierarchical multi-scale language model in which short time-scale…

Computation and Language · Computer Science 2018-05-16 Thomas Wolf, Julien Chaumond, Clement Delangue

Identifying computational mechanisms for memorization and retrieval of data is a long-standing problem at the intersection of machine learning and neuroscience. Our main finding is that standard overparameterized deep neural networks…

Machine Learning · Computer Science 2022-05-25 Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler

Sequential activation of neurons is a common feature of network activity during a variety of behaviors, including working memory and decision making. Previous network models for sequences and memory emphasized specialized architectures in…

Neurons and Cognition · Quantitative Biology 2016-03-16 Kanaka Rajan, Christopher D Harvey, David W Tank

Encoding models are used for predicting brain activity in response to sensory stimuli with the objective of elucidating how sensory information is represented in the brain. Encoding models typically comprise a nonlinear transformation of…

Neurons and Cognition · Quantitative Biology 2017-03-13 Umut Güçlü, Marcel A. J. van Gerven

Neural Machine Translation model is a sequence-to-sequence converter based on neural networks. Existing models use recurrent neural networks to construct both the encoder and decoder modules. In alternative research, the recurrent networks…

Computation and Language · Computer Science 2021-05-04 Ritam Mallick, Seba Susan, Vaibhaw Agrawal, Rizul Garg, Prateek Rawal

Working memory is a cognitive function involving the storage and manipulation of latent information over brief intervals of time, thus making it crucial for context-dependent computation. Here, we use a top-down modeling approach to examine…

Neurons and Cognition · Quantitative Biology 2021-11-17 Elham Ghazizadeh, ShiNung Ching

Learning to remember long sequences remains a challenging task for recurrent neural networks. Register memory and attention mechanisms were both proposed to resolve the issue with either high computational cost to retain memory…

Artificial Intelligence · Computer Science 2017-10-04 Wei Zhang, Bowen Zhou

In this paper, we study novel neural network structures to better model long term dependency in sequential data. We propose to use more memory units to keep track of more preceding states in recurrent neural networks (RNNs), which are all…

Neural and Evolutionary Computing · Computer Science 2016-05-03 Rohollah Soltani, Hui Jiang

Recurrent neural networks have been the dominant models for many speech and language processing tasks. However, we understand little about the behavior and the class of functions recurrent networks can realize. Moreover, the heuristics used…

Computation and Language · Computer Science 2018-11-01 Hao Tang, James Glass

The key challenge of sequence representation learning is to capture the long-range temporal dependencies. Typical methods for supervised sequence representation learning are built upon recurrent neural networks to capture temporal…

Computer Vision and Pattern Recognition · Computer Science 2022-07-21 Wenjie Pei, Xin Feng, Canmiao Fu, Qiong Cao, Guangming Lu, Yu-Wing Tai

Transformers have reached remarkable success in sequence modeling. However, these models have efficiency issues as they need to store all the history token-level representations as memory. We present Memformer, an efficient neural network…

Computation and Language · Computer Science 2022-04-14 Qingyang Wu, Zhenzhong Lan, Kun Qian, Jing Gu, Alborz Geramifard, Zhou Yu

Recent advances in large language models have shown that autoregressive modeling can generate complex and novel sequences that have many real-world applications. However, these models must generate outputs autoregressively, which becomes…

Machine Learning · Computer Science 2023-06-05 Asier Mujika

Encoder-decoder networks with attention have proven to be a powerful way to solve many sequence-to-sequence tasks. In these networks, attention aligns encoder and decoder states and is often used for visualizing network behavior. However,…

Machine Learning · Computer Science 2021-10-29 Kyle Aitken, Vinay V Ramasesh, Yuan Cao, Niru Maheswaranathan

We present the discriminative recurrent sparse auto-encoder model, comprising a recurrent encoder of rectified linear units, unrolled for a fixed number of iterations, and connected to two linear decoders that reconstruct the input and…

Machine Learning · Computer Science 2013-03-20 Jason Tyler Rolfe, Yann LeCun

The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks. Compared to…

Computation and Language · Computer Science 2017-07-26 Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin