Related papers: Maelstrom Networks

200 papers

In this paper, we propose and investigate a new neural network architecture called Neural Random Access Machine. It can manipulate and dereference pointers to an external variable-size random-access memory. The model is trained from pure…

Machine Learning · Computer Science 2016-02-11 Karol Kurach, Marcin Andrychowicz, Ilya Sutskever

It has been observed that neural networks perform poorly when the data or tasks are presented sequentially. Unlike humans, neural networks suffer greatly from catastrophic forgetting, making it impossible to perform life-long learning. To…

Machine Learning · Computer Science 2023-01-31 Longhui Yu, Tianyang Hu, Lanqing Hong, Zhen Liu, Adrian Weller, Weiyang Liu

Recurrent neural networks (RNNs) are known to be difficult to train due to the gradient vanishing and exploding problems and thus difficult to learn long-term patterns and construct deep networks. To address these problems, this paper…

Computer Vision and Pattern Recognition · Computer Science 2020-12-10 Shuai Li, Wanqing Li, Chris Cook, Yanbo Gao

Deep neural networks have revolutionized numerous research fields and applications. Despite their widespread success, a fundamental limitation known as catastrophic forgetting remains, where models fail to retain their ability to perform…

Machine Learning · Computer Science 2025-02-21 Karolina Bogacka, Maximilian Höfler, Maria Ganzha, Wojciech Samek, Katarzyna Wasielewska-Michniewska

Memory-augmented neural networks consisting of a neural controller and an external memory have shown potentials in long-term sequential learning. Current RAM-like memory models maintain memory accessing every timesteps, thus they do not…

Machine Learning · Computer Science 2019-03-21 Hung Le, Truyen Tran, Svetha Venkatesh

One of the main problems encountered so far with recurrent neural networks is that they struggle to retain long-time information dependencies in their recurrent connections. Neural Turing Machines (NTMs) attempt to mitigate this issue by…

Neural and Evolutionary Computing · Computer Science 2024-12-20 Jacopo Castellini

Continual learning models allow to learn and adapt to new changes and tasks over time. However, in continual and sequential learning scenarios in which the models are trained using different data with various distributions, neural networks…

Machine Learning · Computer Science 2020-08-17 HongLin Li, Payam Barnaghi, Shirin Enshaeifar, Frieder Ganz

Continual (or "incremental") learning approaches are employed when additional knowledge or tasks need to be learned from subsequent batches or from streaming data. However these approaches are typically adversary agnostic, i.e., they do not…

Machine Learning · Computer Science 2021-02-17 Muhammad Umer, Robi Polikar

The ability to continuously process and retain new information like we do naturally as humans is a feat that is highly sought after when training neural networks. Unfortunately, the traditional optimization algorithms often require large…

Machine Learning · Computer Science 2022-06-23 Sami Ede, Serop Baghdadlian, Leander Weber, An Nguyen, Dario Zanca, Wojciech Samek, Sebastian Lapuschkin

Working memory is a central cognitive ability crucial for intelligent decision-making. Recent experimental and computational work studying working memory has primarily used categorical (i.e., one-hot) inputs, rather than ecologically…

Artificial Intelligence · Computer Science 2024-11-06 Xiaoxuan Lei, Takuya Ito, Pouya Bashivan

Contemporary wisdom based on empirical studies suggests that standard recurrent neural networks (RNNs) do not perform well on tasks requiring long-term memory. However, precise reasoning for this behavior is still unknown. This paper…

Machine Learning · Computer Science 2021-01-21 Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

A major obstacle to developing artificial intelligence applications capable of true lifelong learning is that artificial neural networks quickly or catastrophically forget previously learned tasks when trained on a new one. Numerous methods…

Machine Learning · Computer Science 2019-04-18 Gido M. van de Ven, Andreas S. Tolias

In Continual Learning (CL), a neural network is trained on a stream of data whose distribution changes over time. In this context, the main problem is how to learn new information without forgetting old knowledge (i.e., Catastrophic…

Recurrent Neural Networks (RNNs) have shown great success in modeling time-dependent patterns, but there is limited research on their learned representations of latent temporal features and the emergence of these representations during…

Machine Learning · Computer Science 2023-06-13 Peter DelMastro, Rushiv Arora, Edward Rietman, Hava T. Siegelmann

Designing deep neural networks is an art that often involves an expensive search over candidate architectures. To overcome this for recurrent neural nets (RNNs), we establish a connection between the hidden state dynamics in an RNN and…

Machine Learning · Computer Science 2021-12-14 Tan M. Nguyen, Richard G. Baraniuk, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang

Short-term memory in standard, general-purpose, sequence-processing recurrent neural networks (RNNs) is stored as activations of nodes or "neurons." Generalising feedforward NNs to such RNNs is mathematically straightforward and natural,…

Neural and Evolutionary Computing · Computer Science 2022-11-18 Kazuki Irie, Jürgen Schmidhuber

Continual learning tries to learn new tasks without forgetting previously learned ones. In reality, most of the existing artificial neural network (ANN) models fail, while humans do the same by remembering previous works throughout their…

Computer Vision and Pattern Recognition · Computer Science 2021-05-18 Subhankar Ghosh

Transformers have reached remarkable success in sequence modeling. However, these models have efficiency issues as they need to store all the history token-level representations as memory. We present Memformer, an efficient neural network…

Computation and Language · Computer Science 2022-04-14 Qingyang Wu, Zhenzhong Lan, Kun Qian, Jing Gu, Alborz Geramifard, Zhou Yu

Existing literature in Continual Learning (CL) has focused on overcoming catastrophic forgetting, the inability of the learner to recall how to perform tasks observed in the past. There are however other desirable properties of a CL system,…

Machine Learning · Computer Science 2021-02-15 Tom Veniat, Ludovic Denoyer, Marc'Aurelio Ranzato

Memory-augmented LLM agents offer an appealing shortcut to continual learning: rather than updating model parameters, they accumulate experience in external memory, seemingly sidestepping the stability-plasticity dilemma of parametric…

Machine Learning · Computer Science 2026-05-01 Qisheng Hu, Quanyu Long, Wenya Wang