Related papers: Learning Memory Access Patterns

A neural network memory prefetcher using semantic locality

Accurate memory prefetching is paramount for processor performance, and modern processors employ various techniques to identify and prefetch different memory access patterns. While most modern prefetchers target spatio-temporal patterns by…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-14 Leeor Peled, Uri Weiser, Yoav Etsion

A Survey of Near-Data Processing Architectures for Neural Networks

Data-intensive workloads and applications, such as machine learning (ML), are fundamentally limited by traditional computing systems based on the von-Neumann architecture. As data movement operations and energy consumption become key…

Hardware Architecture · Computer Science 2021-12-24 Mehdi Hassanpour, Marc Riera, Antonio González

Robust High-dimensional Memory-augmented Neural Networks

Traditional neural networks require enormous amounts of data to build their complex mappings during a slow training procedure that hinders their abilities for relearning and adapting to new data. Memory-augmented neural networks enhance…

Emerging Technologies · Computer Science 2021-06-23 Geethan Karunaratne, Manuel Schmuck, Manuel Le Gallo, Giovanni Cherubini, Luca Benini, Abu Sebastian, Abbas Rahimi

On Memory Codelets: Prefetching, Recoding, Moving and Streaming Data

For decades, memory capabilities have scaled up much slower than compute capabilities, leaving memory utilization as a major bottleneck. Prefetching and cache hierarchies mitigate this in applications with easily predictable memory accesses…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-02-02 Dawson Fox, Jose Monsalve Diaz, Xiaoming Li

Dynamic Computing Random Access Memory

The present von Neumann computing paradigm involves a significant amount of information transfer between a central processing unit (CPU) and memory, with concomitant limitations in the actual execution speed. However, it has been recently…

Emerging Technologies · Computer Science 2014-07-03 Fabio Lorenzo Traversa, Fabrizio Bonani, Yuriy V. Pershin, Massimiliano Di Ventra

Memory-based Parameter Adaptation

Deep neural networks have excelled on a wide range of problems, from vision to language and game playing. Neural networks very gradually incorporate information into weights as they process data, requiring very low learning rates. If the…

Machine Learning · Statistics 2018-03-01 Pablo Sprechmann, Siddhant M. Jayakumar, Jack W. Rae, Alexander Pritzel, Adrià Puigdomènech Badia, Benigno Uria, Oriol Vinyals, Demis Hassabis, Razvan Pascanu, Charles Blundell

Neural Random-Access Machines

In this paper, we propose and investigate a new neural network architecture called Neural Random Access Machine. It can manipulate and dereference pointers to an external variable-size random-access memory. The model is trained from pure…

Machine Learning · Computer Science 2016-02-11 Karol Kurach, Marcin Andrychowicz, Ilya Sutskever

A Co-design view of Compute in-Memory with Non-Volatile Elements for Neural Networks

Deep Learning neural networks are pervasive, but traditional computer architectures are reaching the limits of being able to efficiently execute them for the large workloads of today. They are limited by the von Neumann bottleneck: the high…

Emerging Technologies · Computer Science 2022-06-22 Wilfried Haensch, Anand Raghunathan, Kaushik Roy, Bhaswar Chakrabarti, Charudatta M. Phatak, Cheng Wang, Supratik Guha

Memory and attention in deep learning

Intelligence necessitates memory. Without memory, humans fail to perform various nontrivial tasks such as reading novels, playing games or solving maths. As the ultimate goal of machine learning is to derive intelligent systems that learn…

Machine Learning · Computer Science 2021-07-06 Hung Le

Deep Learning based Data Prefetching in CPU-GPU Unified Virtual Memory

Unified Virtual Memory (UVM) relieves the developers from the onus of maintaining complex data structures and explicit data migration by enabling on-demand data movement between CPU memory and GPU memory. However, on-demand paging soon…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-01-11 Xinjian Long, Xiangyang Gong, Huiyang Zhou

Mitigating the Memory Bottleneck with Machine Learning-Driven and Data-Aware Microarchitectural Techniques

Modern applications process massive data volumes that overwhelm the storage and retrieval capabilities of memory systems, making memory the primary performance and energy-efficiency bottleneck of computing systems. Although many…

Hardware Architecture · Computer Science 2026-03-10 Rahul Bera

An Overview of In-memory Processing with Emerging Non-volatile Memory for Data-intensive Applications

The conventional von Neumann architecture has been revealed as a major performance and energy bottleneck for rising data-intensive applications. The decade-old idea of leveraging in-memory processing…

Hardware Architecture · Computer Science 2019-06-18 Bing Li, Bonan Yan, Hai Li

One-shot Learning with Memory-Augmented Neural Networks

Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning." Traditional gradient-based networks require a lot of data to learn, often through…

Machine Learning · Computer Science 2016-05-20 Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap

A Memory Hierarchical Layer Assigning and Prefetching Technique to Overcome the Memory Performance/Energy Bottleneck

The memory subsystem has always been a bottleneck in performance as well as significant power contributor in memory intensive applications. Many researchers have presented multi-layered memory hierarchies as a means to design energy and…

Hardware Architecture · Computer Science 2011-11-09 Minas Dasygenis, Erik Brockmeyer, Bart Durinck, Francky Catthoor, Dimitrios Soudris, Antonios Thanailakis

Prefetcher-based DRAM Architecture

Advancement in Processor technology has made it easy to handle data-intensive workloads, but limiting main memory advances has created performance bottlenecks. In DRAM, there have been improvements in DRAM access latency as well as…

Hardware Architecture · Computer Science 2021-05-24 Saurabh Jaiswal, Shailendra Kumar Gupta, Soumya Soubhagya Dandapat

Learning Before Filtering: Real-Time Hardware Learning at the Detector Level

Advances in sensor technology and automation have ushered in an era of data abundance, where the ability to identify and extract relevant information in real time has become increasingly critical. Traditional filtering approaches, which…

High Energy Physics - Experiment · Physics 2025-07-29 Boštjan Maček

Memory Wall is not gone: A Critical Outlook on Memory Architecture in Digital Neuromorphic Computing

The rapid advancement of neuromorphic technology aims to address the memory wall challenge inherent in conventional von Neumann architectures. This paper critically examines current digital neuromorphic processors and their strategies to…

Hardware Architecture · Computer Science 2026-04-13 Amirreza Yousefzadeh, Sameed Sohail, Ana Lucia Varbanescu

Interleaver Design for Deep Neural Networks

We propose a class of interleavers for a novel deep neural network (DNN) architecture that uses algorithmically pre-determined, structured sparsity to significantly lower memory and computational requirements, and speed up training. The…

Machine Learning · Computer Science 2019-04-29 Sourya Dey, Peter A. Beerel, Keith M. Chugg

Hardware Acceleration for Neural Networks: A Comprehensive Survey

Neural networks have become dominant computational workloads across cloud and edge platforms, but their rapid growth in model size and deployment diversity has exposed hardware bottlenecks increasingly dominated by memory movement,…

Systems and Control · Electrical Eng. & Systems 2026-01-16 Bin Xu, Ayan Banerjee, Sandeep Gupta

Data Cache Prefetching with Perceptron Learning

Cache prefetcher greatly eliminates compulsory cache misses, by fetching data from slower memory to faster cache before it is actually required by processors. Sophisticated prefetchers predict next use cache line by repeating program's…

Hardware Architecture · Computer Science 2017-12-05 Haoyuan Wang, Zhiwei Luo