Related papers: The Kanerva Machine: A Generative Distributed Memo…

Kanerva++: extending The Kanerva Machine with differentiable, locally block allocated latent memory

Episodic and semantic memory are critical components of the human memory model. The theory of complementary learning systems (McClelland et al., 1995) suggests that the compressed representation produced by a serial event (episodic memory)…

Neural and Evolutionary Computing · Computer Science 2022-02-08 Jason Ramapuram, Yan Wu, Alexandros Kalousis

Product Kanerva Machines: Factorized Bayesian Memory

An ideal cognitively-inspired memory system would compress and organize incoming items. The Kanerva Machine (Wu et al., 2018) is a Bayesian model that naturally implements online memory compression. However, the organization of the Kanerva…

Machine Learning · Computer Science 2020-02-07 Adam Marblestone, Yan Wu, Greg Wayne

A New Training Algorithm for Kanerva's Sparse Distributed Memory

The Sparse Distributed Memory (SDM for short) proposed by Pentti Kanerva was thought to be a model of human long-term memory. The architecture of the SDM allows binary patterns to be stored and retrieved using partially matching…

Computer Vision and Pattern Recognition · Computer Science 2012-07-30 Lou Marvin Caraig

Generative Learning of Continuous Data by Tensor Networks

Beyond their origin in modeling many-body quantum systems, tensor networks have emerged as a promising class of models for solving machine learning problems, notably in unsupervised generative learning. While possessing many desirable…

Machine Learning · Computer Science 2024-07-26 Alex Meiburg, Jing Chen, Jacob Miller, Raphaëlle Tihon, Guillaume Rabusseau, Alejandro Perdomo-Ortiz

Learning Attractor Dynamics for Generative Memory

A central challenge faced by memory systems is the robust retrieval of a stored pattern in the presence of interference due to other stored patterns and noise. A theoretically well-founded solution to robust retrieval is given by attractor…

Machine Learning · Computer Science 2018-11-26 Yan Wu, Greg Wayne, Karol Gregor, Timothy Lillicrap

Fast Adaptation in Generative Models with Generative Matching Networks

Despite recent advances, the remaining bottlenecks in deep generative models are the necessity of extensive training and difficulties with generalization from a small number of training examples. We develop a new generative model called…

Machine Learning · Statistics 2017-09-06 Sergey Bartunov, Dmitry P. Vetrov

Learning to Learn with Generative Models of Neural Network Checkpoints

We explore a data-driven approach for learning to optimize neural networks. We construct a dataset of neural network checkpoints and train a generative model on the parameters. In particular, our model is a conditional diffusion transformer…

Machine Learning · Computer Science 2022-09-27 William Peebles, Ilija Radosavovic, Tim Brooks, Alexei A. Efros, Jitendra Malik

Continual Learning with Bayesian Model based on a Fixed Pre-trained Feature Extractor

Deep learning has shown human-level performance in various applications. However, current deep learning models are characterised by catastrophic forgetting of old knowledge when learning new classes. This poses a challenge particularly…

Machine Learning · Computer Science 2022-04-29 Yang Yang, Zhiying Cui, Junjie Xu, Changhong Zhong, Wei-Shi Zheng, Ruixuan Wang

Memory Efficient Continual Learning with Transformers

In many real-world scenarios, data to train machine learning models becomes available over time. Unfortunately, these models struggle to continually learn new concepts without forgetting what has been learnt in the past. This phenomenon is…

Computation and Language · Computer Science 2023-01-16 Beyza Ermis, Giovanni Zappella, Martin Wistuba, Aditya Rawal, Cedric Archambeau

Continual Learning in Generative Adversarial Nets

Developments in deep generative models have allowed for tractable learning of high-dimensional data distributions. While the employed learning procedures typically assume that training data is drawn i.i.d. from the distribution of interest,…

Machine Learning · Computer Science 2017-05-24 Ari Seff, Alex Beatson, Daniel Suo, Han Liu

Supervised Generative Reconstruction: An Efficient Way To Flexibly Store and Recognize Patterns

Matching animal-like flexibility in recognition and the ability to quickly incorporate new information remains difficult. Limits are yet to be adequately addressed in neural models and recognition algorithms. This work proposes a…

Computer Vision and Pattern Recognition · Computer Science 2012-06-26 Tsvi Achler

Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules

Humans excel at discovering regular structures from limited samples and applying inferred rules to novel settings. We investigate whether modern generative models can similarly learn underlying rules from finite samples and perform…

Machine Learning · Computer Science 2024-11-13 Binxu Wang, Jiaqi Shang, Haim Sompolinsky

Learning to Generate with Memory

Memory units have been widely used to enrich the capabilities of deep networks in capturing long-term dependencies in reasoning and prediction tasks, but little investigation exists on deep generative models (DGMs) which are good at…

Machine Learning · Computer Science 2016-05-31 Chongxuan Li, Jun Zhu, Bo Zhang

Efficient Data-Parallel Continual Learning with Asynchronous Distributed Rehearsal Buffers

Deep learning has emerged as a powerful method for extracting valuable information from large volumes of data. However, when new training data arrives continuously (i.e., is not fully available from the beginning), incremental training…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-06 Thomas Bouvier, Bogdan Nicolae, Hugo Chaugier, Alexandru Costan, Ian Foster, Gabriel Antoniu

Attention Approximates Sparse Distributed Memory

While Attention has come to be an important mechanism in deep learning, there remains limited intuition for why it works so well. Here, we show that Transformer Attention can be closely related under certain data conditions to Kanerva's…

Machine Learning · Computer Science 2022-01-19 Trenton Bricken, Cengiz Pehlevan

Generative Kernel Continual learning

Kernel continual learning by Derakhshani et al. (2021) has recently emerged as a strong continual learner due to its non-parametric ability to tackle task interference and catastrophic forgetting. Unfortunately, its success comes at the…

Machine Learning · Computer Science 2021-12-28 Mohammad Mahdi Derakhshani, Xiantong Zhen, Ling Shao, Cees G. M. Snoek

Tell Me What To Learn: Generalizing Neural Memory to be Controllable in Natural Language

Modern machine learning models are deployed in diverse, non-stationary environments where they must continually adapt to new tasks and evolving knowledge. Continual fine-tuning and in-context learning are costly and brittle, whereas neural…

Machine Learning · Computer Science 2026-03-04 Max S. Bennett, Thomas P. Zollo, Richard Zemel

Kernel Memory Networks: A Unifying Framework for Memory Modeling

We consider the problem of training a neural network to store a set of patterns with maximal noise robustness. A solution, in terms of optimal weights and state update rules, is derived by training each individual neuron to perform either…

Neural and Evolutionary Computing · Computer Science 2024-07-24 Georgios Iatropoulos, Johanni Brea, Wulfram Gerstner

Latent-Variable Generative Models for Data-Efficient Text Classification

Generative classifiers offer potential advantages over their discriminative counterparts, namely in the areas of data efficiency, robustness to data shift and adversarial examples, and zero-shot learning (Ng and Jordan, 2002; Yogatama et…

Computation and Language · Computer Science 2019-10-02 Xiaoan Ding, Kevin Gimpel

Adaptive Neural Compilation

This paper proposes an adaptive neural-compilation framework to address the problem of efficient program learning. Traditional code optimisation strategies used in compilers are based on applying a pre-specified set of transformations that…

Artificial Intelligence · Computer Science 2016-05-27 Rudy Bunel, Alban Desmaison, Pushmeet Kohli, Philip H. S. Torr, M. Pawan Kumar