Related papers: Maelstrom Networks

Neural Random-Access Machines

In this paper, we propose and investigate a new neural network architecture called Neural Random Access Machine. It can manipulate and dereference pointers to an external variable-size random-access memory. The model is trained from pure…

Machine Learning · Computer Science 2016-02-11 Karol Kurach, Marcin Andrychowicz, Ilya Sutskever

Continual Learning by Modeling Intra-Class Variation

It has been observed that neural networks perform poorly when the data or tasks are presented sequentially. Unlike humans, neural networks suffer greatly from catastrophic forgetting, making it impossible to perform life-long learning. To…

Machine Learning · Computer Science 2023-01-31 Longhui Yu, Tianyang Hu, Lanqing Hong, Zhen Liu, Adrian Weller, Weiyang Liu

Deep Independently Recurrent Neural Network (IndRNN)

Recurrent neural networks (RNNs) are known to be difficult to train due to the gradient vanishing and exploding problems and thus difficult to learn long-term patterns and construct deep networks. To address these problems, this paper…

Computer Vision and Pattern Recognition · Computer Science 2020-12-10 Shuai Li, Wanqing Li, Chris Cook, Yanbo Gao

ReReLRP -- Remembering and Recognizing Tasks with LRP

Deep neural networks have revolutionized numerous research fields and applications. Despite their widespread success, a fundamental limitation known as catastrophic forgetting remains, where models fail to retain their ability to perform…

Machine Learning · Computer Science 2025-02-21 Karolina Bogacka, Maximilian Höfler, Maria Ganzha, Wojciech Samek, Katarzyna Wasielewska-Michniewska

Learning to Remember More with Less Memorization

Memory-augmented neural networks consisting of a neural controller and an external memory have shown potentials in long-term sequential learning. Current RAM-like memory models maintain memory accessing every timesteps, thus they do not…

Machine Learning · Computer Science 2019-03-21 Hung Le, Truyen Tran, Svetha Venkatesh

Learning Numeracy: Binary Arithmetic with Neural Turing Machines

One of the main problems encountered so far with recurrent neural networks is that they struggle to retain long-time information dependencies in their recurrent connections. Neural Turing Machines (NTMs) attempt to mitigate this issue by…

Neural and Evolutionary Computing · Computer Science 2024-12-20 Jacopo Castellini

Continual Learning Using Bayesian Neural Networks

Continual learning models allow to learn and adapt to new changes and tasks over time. However, in continual and sequential learning scenarios in which the models are trained using different data with various distributions, neural networks…

Machine Learning · Computer Science 2020-08-17 HongLin Li, Payam Barnaghi, Shirin Enshaeifar, Frieder Ganz

Adversarial Targeted Forgetting in Regularization and Generative Based Continual Learning Models

Continual (or "incremental") learning approaches are employed when additional knowledge or tasks need to be learned from subsequent batches or from streaming data. However these approaches are typically adversary agnostic, i.e., they do not…

Machine Learning · Computer Science 2021-02-17 Muhammad Umer, Robi Polikar

Explain to Not Forget: Defending Against Catastrophic Forgetting with XAI

The ability to continuously process and retain new information like we do naturally as humans is a feat that is highly sought after when training neural networks. Unfortunately, the traditional optimization algorithms often require large…

Machine Learning · Computer Science 2022-06-23 Sami Ede, Serop Baghdadlian, Leander Weber, An Nguyen, Dario Zanca, Wojciech Samek, Sebastian Lapuschkin

Geometry of naturalistic object representations in recurrent neural network models of working memory

Working memory is a central cognitive ability crucial for intelligent decision-making. Recent experimental and computational work studying working memory has primarily used categorical (i.e., one-hot) inputs, rather than ecologically…

Artificial Intelligence · Computer Science 2024-11-06 Xiaoxuan Lei, Takuya Ito, Pouya Bashivan

Implicit Bias of Linear RNNs

Contemporary wisdom based on empirical studies suggests that standard recurrent neural networks (RNNs) do not perform well on tasks requiring long-term memory. However, precise reasoning for this behavior is still unknown. This paper…

Machine Learning · Computer Science 2021-01-21 Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

Generative replay with feedback connections as a general strategy for continual learning

A major obstacle to developing artificial intelligence applications capable of true lifelong learning is that artificial neural networks quickly or catastrophically forget previously learned tasks when trained on a new one. Numerous methods…

Machine Learning · Computer Science 2019-04-18 Gido M. van de Ven, Andreas S. Tolias

Effects of Auxiliary Knowledge on Continual Learning

In Continual Learning (CL), a neural network is trained on a stream of data whose distribution changes over time. In this context, the main problem is how to learn new information without forgetting old knowledge (i.e., Catastrophic…

Machine Learning · Computer Science 2022-06-07 Giovanni Bellitto, Matteo Pennisi, Simone Palazzo, Lorenzo Bonicelli, Matteo Boschini, Simone Calderara, Concetto Spampinato

On the Dynamics of Learning Time-Aware Behavior with Recurrent Neural Networks

Recurrent Neural Networks (RNNs) have shown great success in modeling time-dependent patterns, but there is limited research on their learned representations of latent temporal features and the emergence of these representations during…

Machine Learning · Computer Science 2023-06-13 Peter DelMastro, Rushiv Arora, Edward Rietman, Hava T. Siegelmann

MomentumRNN: Integrating Momentum into Recurrent Neural Networks

Designing deep neural networks is an art that often involves an expensive search over candidate architectures. To overcome this for recurrent neural nets (RNNs), we establish a connection between the hidden state dynamics in an RNN and…

Machine Learning · Computer Science 2021-12-14 Tan M. Nguyen, Richard G. Baraniuk, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang

Learning to Control Rapidly Changing Synaptic Connections: An Alternative Type of Memory in Sequence Processing Artificial Neural Networks

Short-term memory in standard, general-purpose, sequence-processing recurrent neural networks (RNNs) is stored as activations of nodes or "neurons." Generalising feedforward NNs to such RNNs is mathematically straightforward and natural,…

Neural and Evolutionary Computing · Computer Science 2022-11-18 Kazuki Irie, Jürgen Schmidhuber

Shared and Private VAEs with Generative Replay for Continual Learning

Continual learning tries to learn new tasks without forgetting previously learned ones. In reality, most of the existing artificial neural network (ANN) models fail, while humans do the same by remembering previous works throughout their…

Computer Vision and Pattern Recognition · Computer Science 2021-05-18 Subhankar Ghosh

Memformer: A Memory-Augmented Transformer for Sequence Modeling

Transformers have reached remarkable success in sequence modeling. However, these models have efficiency issues as they need to store all the history token-level representations as memory. We present Memformer, an efficient neural network…

Computation and Language · Computer Science 2022-04-14 Qingyang Wu, Zhenzhong Lan, Kun Qian, Jing Gu, Alborz Geramifard, Zhou Yu

Efficient Continual Learning with Modular Networks and Task-Driven Priors

Existing literature in Continual Learning (CL) has focused on overcoming catastrophic forgetting, the inability of the learner to recall how to perform tasks observed in the past. There are however other desirable properties of a CL system,…

Machine Learning · Computer Science 2021-02-15 Tom Veniat, Ludovic Denoyer, Marc'Aurelio Ranzato

When Continual Learning Moves to Memory: A Study of Experience Reuse in LLM Agents

Memory-augmented LLM agents offer an appealing shortcut to continual learning: rather than updating model parameters, they accumulate experience in external memory, seemingly sidestepping the stability-plasticity dilemma of parametric…

Machine Learning · Computer Science 2026-05-01 Qisheng Hu, Quanyu Long, Wenya Wang