Related papers: Understanding Memory Modules on Learning Simple Al…

Structure Learning for Neural Module Networks

Neural Module Networks, originally proposed for the task of visual question answering, are a class of neural network architectures that involve human-specified neural modules, each designed for a specific form of reasoning. In current…

Machine Learning · Computer Science 2019-11-11 Vardaan Pahuja, Jie Fu, Sarath Chandar, Christopher J. Pal

Breaking Neural Network Scaling Laws with Modularity

Modular neural networks outperform nonmodular neural networks on tasks ranging from visual question answering to robotics. These performance improvements are thought to be due to modular networks' superior ability to model the compositional…

Machine Learning · Computer Science 2025-03-12 Akhilan Boopathy, Sunshine Jiang, William Yue, Jaedong Hwang, Abhiram Iyer, Ila Fiete

Neural Turing Machines

We extend the capabilities of neural networks by coupling them to external memory resources, which they can interact with by attentional processes. The combined system is analogous to a Turing Machine or Von Neumann architecture but is…

Neural and Evolutionary Computing · Computer Science 2014-12-11 Alex Graves, Greg Wayne, Ivo Danihelka

Learning Numeracy: Binary Arithmetic with Neural Turing Machines

One of the main problems encountered so far with recurrent neural networks is that they struggle to retain long-time information dependencies in their recurrent connections. Neural Turing Machines (NTMs) attempt to mitigate this issue by…

Neural and Evolutionary Computing · Computer Science 2024-12-20 Jacopo Castellini

Modular Networks: Learning to Decompose Neural Computation

Scaling model capacity has been vital in the success of deep learning. For a typical network, necessary compute resources and training time grow dramatically with model size. Conditional computation is a promising way to increase the number…

Machine Learning · Computer Science 2018-11-14 Louis Kirsch, Julius Kunze, David Barber

Improving Systematic Generalization Through Modularity and Augmentation

Systematic generalization is the ability to combine known parts into novel meaning; an important aspect of efficient human learning, but a weakness of neural network learning. In this work, we investigate how two well-known modeling…

Artificial Intelligence · Computer Science 2022-02-23 Laura Ruis, Brenden Lake

One-shot Learning with Memory-Augmented Neural Networks

Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning." Traditional gradient-based networks require a lot of data to learn, often through…

Machine Learning · Computer Science 2016-05-20 Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap

Modular meta-learning

Many prediction problems, such as those that arise in the context of robotics, have a simplifying underlying structure that, if known, could accelerate learning. In this paper, we present a strategy for learning a set of neural network…

Machine Learning · Computer Science 2019-05-06 Ferran Alet, Tomás Lozano-Pérez, Leslie P. Kaelbling

On The Specialization of Neural Modules

A number of machine learning models have been proposed with the goal of achieving systematic generalization: the ability to reason about new situations by combining aspects of previous experiences. These models leverage compositional…

Machine Learning · Computer Science 2024-09-24 Devon Jarvis, Richard Klein, Benjamin Rosman, Andrew M. Saxe

Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling

Reasoning is a core capability of large language models, yet how multi-step reasoning is learned and executed remains unclear. We study this question in a controlled cellular-automata (1dCA) framework that excludes memorisation by using…

Machine Learning · Computer Science 2026-05-08 Ivan Rodkin, Daniil Orel, Konstantin Smirnov, Arman Bolatov, Bilal Elbouardi, Besher Hassan, Yuri Kuratov, Aydar Bulatov, Preslav Nakov, Timothy Baldwin, Artem Shelmanov, Mikhail Burtsev

Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory

The effectiveness of recurrent neural networks can be largely influenced by their ability to store into their dynamical memory information extracted from input sequences at different frequencies and timescales. Such a feature can be…

Machine Learning · Computer Science 2020-07-01 Antonio Carta, Alessandro Sperduti, Davide Bacciu

Structured Memory for Neural Turing Machines

Neural Turing Machines (NTM) contain memory component that simulates "working memory" in the brain to store and retrieve information to ease simple algorithms learning. So far, only linearly organized memory is proposed, and during…

Artificial Intelligence · Computer Science 2015-10-27 Wei Zhang, Yang Yu, Bowen Zhou

Memory-Augmented Recurrent Neural Networks Can Learn Generalized Dyck Languages

We introduce three memory-augmented Recurrent Neural Networks (MARNNs) and explore their capabilities on a series of simple language modeling tasks whose solutions require stack-based mechanisms. We provide the first demonstration of neural…

Computation and Language · Computer Science 2019-11-11 Mirac Suzgun, Sebastian Gehrmann, Yonatan Belinkov, Stuart M. Shieber

Concept Learning through Deep Reinforcement Learning with Memory-Augmented Neural Networks

Deep neural networks have shown superior performance in many regimes to remember familiar patterns with large amounts of data. However, the standard supervised deep learning paradigm is still limited when facing the need to learn new…

Machine Learning · Computer Science 2018-11-16 Jing Shi, Jiaming Xu, Yiqun Yao, Bo Xu

Neural Execution Engines: Learning to Execute Subroutines

A significant effort has been made to train neural networks that replicate algorithmic reasoning, but they often fail to learn the abstract concepts underlying these algorithms. This is evidenced by their inability to generalize to data…

Machine Learning · Computer Science 2020-10-26 Yujun Yan, Kevin Swersky, Danai Koutra, Parthasarathy Ranganathan, Milad Hashemi

ResMem: Learn what you can and memorize the rest

The impressive generalization performance of modern neural networks is attributed in part to their ability to implicitly memorize complex training patterns. Inspired by this, we explore a novel mechanism to improve model generalization via…

Machine Learning · Computer Science 2023-10-24 Zitong Yang, Michal Lukasik, Vaishnavh Nagarajan, Zonglin Li, Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Sanjiv Kumar

Memory-Modular Classification: Learning to Generalize with Memory Replacement

We propose a novel memory-modular learner for image classification that separates knowledge memorization from reasoning. Our model enables effective generalization to new classes by simply replacing the memory contents, without the need for…

Computer Vision and Pattern Recognition · Computer Science 2025-04-09 Dahyun Kang, Ahmet Iscen, Eunchan Jo, Sua Choi, Minsu Cho, Cordelia Schmid

A Combinatorial Perspective on Transfer Learning

Human intelligence is characterized not only by the capacity to learn complex skills, but the ability to rapidly adapt and acquire new skills within an ever-changing environment. In this work we study how the learning of modular solutions…

Machine Learning · Computer Science 2020-10-26 Jianan Wang, Eren Sezener, David Budden, Marcus Hutter, Joel Veness

Memory and attention in deep learning

Intelligence necessitates memory. Without memory, humans fail to perform various nontrivial tasks such as reading novels, playing games or solving maths. As the ultimate goal of machine learning is to derive intelligent systems that learn…

Machine Learning · Computer Science 2021-07-06 Hung Le

Tell Me What To Learn: Generalizing Neural Memory to be Controllable in Natural Language

Modern machine learning models are deployed in diverse, non-stationary environments where they must continually adapt to new tasks and evolving knowledge. Continual fine-tuning and in-context learning are costly and brittle, whereas neural…

Machine Learning · Computer Science 2026-03-04 Max S. Bennett, Thomas P. Zollo, Richard Zemel