Related papers: Differentiable Random Access Memory using Lattices

Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes

Neural networks augmented with external memory have the ability to learn algorithmic solutions to complex tasks. These models appear promising for applications such as language modeling and machine translation. However, they scale poorly in…

Machine Learning · Computer Science 2016-10-31 Jack W Rae, Jonathan J Hunt, Tim Harley, Ivo Danihelka, Andrew Senior, Greg Wayne, Alex Graves, Timothy P Lillicrap

Neural Random-Access Machines

In this paper, we propose and investigate a new neural network architecture called Neural Random Access Machine. It can manipulate and dereference pointers to an external variable-size random-access memory. The model is trained from pure…

Machine Learning · Computer Science 2016-02-11 Karol Kurach, Marcin Andrychowicz, Ilya Sutskever

Large Memory Layers with Product Keys

This paper introduces a structured memory which can be easily integrated into a neural network. The memory is very large by design and significantly increases the capacity of the architecture, by up to a billion parameters with a negligible…

Computation and Language · Computer Science 2019-12-17 Guillaume Lample, Alexandre Sablayrolles, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou

Memory Layers at Scale

Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, sparsely activated memory layers complement compute-heavy dense feed-forward layers, providing dedicated…

Computation and Language · Computer Science 2024-12-23 Vincent-Pierre Berges, Barlas Oğuz, Daniel Haziza, Wen-tau Yih, Luke Zettlemoyer, Gargi Ghosh

Memory-Efficient Differentiable Programming for Quantum Optimal Control of Discrete Lattices

Quantum optimal control problems are typically solved by gradient-based algorithms such as GRAPE, which suffer from exponential growth in storage with increasing number of qubits and linear growth in memory requirements with increasing…

Quantum Physics · Physics 2022-10-18 Xian Wang, Paul Kairys, Sri Hari Krishna Narayanan, Jan Hückelheim, Paul Hovland

RAM-Net: Expressive Linear Attention with Selectively Addressable Memory

While linear attention architectures offer efficient inference, compressing unbounded history into a fixed-size memory inherently limits expressivity and causes information loss. To address this limitation, we introduce Random Access Memory…

Machine Learning · Computer Science 2026-02-13 Kaicheng Xiao, Haotian Li, Liran Dong, Guoliang Xing

Evaluating Persistent Memory Range Indexes: Part Two [Extended Version]

Scalable persistent memory (PM) has opened up new opportunities for building indexes that operate and persist data directly on the memory bus, potentially enabling instant recovery, low latency and high throughput. When real PM hardware…

Databases · Computer Science 2022-07-29 Yuliang He, Duo Lu, Kaisong Huang, Tianzheng Wang

The Future of Memory: Limits and Opportunities

Memory latency, bandwidth, capacity, and energy increasingly limit performance. In this paper, we reconsider proposed system architectures that consist of huge (many-terabyte to petabyte scale) memories shared among large numbers of CPUs.…

Hardware Architecture · Computer Science 2025-09-24 Samuel Dayo, Shuhan Liu, Peijing Li, Philip Levis, Subhasish Mitra, Thierry Tambe, David Tennenhouse, H.-S. Philip Wong

OPTIMUM-DERAM: Highly Consistent, Scalable, and Secure Multi-Object Memory using RLNC

This paper introduces OPTIMUM-DERAM, a highly consistent, scalable, secure, and decentralized shared memory solution. Traditional distributed shared memory implementations offer multi-object support by multi-threading a single object memory…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-21 Nicolas Nicolaou, Kishori M. Konwar, Moritz Grundei, Aleksandr Bezobchuk, Muriel Médard, Sriram Vishwanath

Lie Access Neural Turing Machine

Following the recent trend in explicit neural memory structures, we present a new design of an external memory, wherein memories are stored in an Euclidean key space $\mathbb R^n$. An LSTM controller performs read and write via specialized…

Neural and Evolutionary Computing · Computer Science 2016-09-07 Greg Yang

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Large language models (LLMs) are central to modern natural language processing, delivering exceptional performance in various tasks. However, their substantial computational and memory requirements present challenges, especially for devices…

Computation and Language · Computer Science 2024-08-01 Keivan Alizadeh, Iman Mirzadeh, Dmitry Belenko, Karen Khatamifard, Minsik Cho, Carlo C Del Mundo, Mohammad Rastegari, Mehrdad Farajtabar

Resistive Memory for Computing and Security: Algorithms, Architectures, and Platforms

Resistive random-access memory (RRAM) is gaining popularity due to its ability to offer computing within the memory and its non-volatile nature. The unique properties of RRAM, such as binary switching, multi-state switching, and device…

Emerging Technologies · Computer Science 2024-07-08 Simranjeet Singh, Farhad Merchant, Sachin Patkar

Reservoir Memory Machines as Neural Computers

Differentiable neural computers extend artificial neural networks with an explicit memory without interference, thus enabling the model to perform classic computation tasks such as graph traversal. However, such models are difficult to…

Machine Learning · Computer Science 2022-06-06 Benjamin Paaßen, Alexander Schulz, Terrence C. Stewart, Barbara Hammer

Latency-Aware Differentiable Neural Architecture Search

Differentiable neural architecture search methods became popular in recent years, mainly due to their low search costs and flexibility in designing the search space. However, these methods suffer the difficulty in optimizing network, so…

Computer Vision and Pattern Recognition · Computer Science 2020-03-27 Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Bowen Shi, Qi Tian, Hongkai Xiong

Reduction of the Random Access Memory Size in Adjoint Algorithmic Differentiation by Overloading

Adjoint algorithmic differentiation by operator and function overloading is based on the interpretation of directed acyclic graphs resulting from evaluations of numerical simulation programs. The size of the computer system memory required…

Mathematical Software · Computer Science 2022-07-15 Uwe Naumann

Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling

Reasoning is a core capability of large language models, yet how multi-step reasoning is learned and executed remains unclear. We study this question in a controlled cellular-automata (1dCA) framework that excludes memorisation by using…

Machine Learning · Computer Science 2026-05-08 Ivan Rodkin, Daniil Orel, Konstantin Smirnov, Arman Bolatov, Bilal Elbouardi, Besher Hassan, Yuri Kuratov, Aydar Bulatov, Preslav Nakov, Timothy Baldwin, Artem Shelmanov, Mikhail Burtsev

Depth-Gated LSTM

In this short note, we present an extension of long short-term memory (LSTM) neural networks to using a depth gate to connect memory cells of adjacent layers. Doing so introduces a linear dependence between lower and upper layer recurrent…

Neural and Evolutionary Computing · Computer Science 2015-08-26 Kaisheng Yao, Trevor Cohn, Katerina Vylomova, Kevin Duh, Chris Dyer

One-shot Learning with Memory-Augmented Neural Networks

Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning." Traditional gradient-based networks require a lot of data to learn, often through…

Machine Learning · Computer Science 2016-05-20 Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap

Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers

Recent studies have demonstrated that the performance of transformers on the task of language modeling obeys a power-law relationship with model size over six orders of magnitude. While transformers exhibit impressive scaling, their…

Machine Learning · Computer Science 2021-10-07 Narsimha Chilkuri, Eric Hunsberger, Aaron Voelker, Gurshaant Malik, Chris Eliasmith

Scaling Laws for Associative Memories

Learning arguably involves the discovery and memorization of abstract rules. The aim of this paper is to study associative memory mechanisms. Our model is based on high-dimensional matrices consisting of outer products of embeddings, which…

Machine Learning · Statistics 2024-02-22 Vivien Cabannes, Elvis Dohmatob, Alberto Bietti