Related papers: A Workload and Programming Ease Driven Perspective…

Methodologies, Workloads, and Tools for Processing-in-Memory: Enabling the Adoption of Data-Centric Architectures

The increasing prevalence and growing size of data in modern applications have led to high costs for computation in traditional processor-centric computing systems. Moving large volumes of data between memory devices (e.g., DRAM) and…

Hardware Architecture · Computer Science 2022-06-01 Geraldo F. Oliveira , Juan Gómez-Luna , Saugata Ghose , Onur Mutlu

Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-in-Memory Hardware

Many modern workloads such as neural network inference and graph processing are fundamentally memory-bound. For such workloads, data movement between memory and CPU cores imposes a significant overhead in terms of both latency and energy. A…

Hardware Architecture · Computer Science 2023-04-04 Juan Gómez-Luna , Izzat El Hajj , Ivan Fernandez , Christina Giannoula , Geraldo F. Oliveira , Onur Mutlu

Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture

Many modern workloads, such as neural networks, databases, and graph processing, are fundamentally memory-bound. For such workloads, the data movement between main memory and CPU cores imposes a significant overhead in terms of both latency…

Hardware Architecture · Computer Science 2022-05-06 Juan Gómez-Luna , Izzat El Hajj , Ivan Fernandez , Christina Giannoula , Geraldo F. Oliveira , Onur Mutlu

A Modern Primer on Processing in Memory

This paper discusses recent research that aims to enable computation close to data, an approach we broadly call processing-in-memory (PIM). PIM places computation mechanisms in or near where the data is stored (i.e., inside memory chips or…

Hardware Architecture · Computer Science 2025-02-07 Onur Mutlu , Saugata Ghose , Juan Gómez-Luna , Rachata Ausavarungnirun , Mohammad Sadrosadati , Geraldo F. Oliveira

Enabling Practical Processing in and near Memory for Data-Intensive Computing

Modern computing systems suffer from the dichotomy between computation on one side, which is performed only in the processor (and accelerators), and data storage/movement on the other, which all other parts of the system are dedicated to.…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-14 Onur Mutlu , Saugata Ghose , Juan Gómez-Luna , Rachata Ausavarungnirun

Modeling and Simulation Frameworks for Processing-in-Memory Architectures

Processing-in-Memory (PIM) has emerged as a promising computing paradigm to address the memory wall and the fundamental bottleneck of the von Neumann architecture by reducing costly data movement between memory and processing units. As with…

Hardware Architecture · Computer Science 2025-12-02 Mahdi Aghaei , Saba Ebrahimi , Mohammad Saleh Arafati , Elham Cheshmikhani , Dara Rahmati , Saeid Gorgin , Jungrae Kim

Processing Data Where It Makes Sense: Enabling In-Memory Computation

Today's systems are overwhelmingly designed to move data to computation. This design choice goes directly against at least three key trends in systems that cause performance, scalability and energy bottlenecks: (1) data access from memory…

Hardware Architecture · Computer Science 2019-03-12 Onur Mutlu , Saugata Ghose , Juan Gómez-Luna , Rachata Ausavarungnirun

Enabling the Adoption of Processing-in-Memory: Challenges, Mechanisms, Future Research Directions

Poor DRAM technology scaling over the course of many years has caused DRAM-based main memory to increasingly become a larger system bottleneck. A major reason for the bottleneck is that data stored within DRAM must be moved across a…

Hardware Architecture · Computer Science 2018-02-02 Saugata Ghose , Kevin Hsieh , Amirali Boroumand , Rachata Ausavarungnirun , Onur Mutlu

PIM-MMU: A Memory Management Unit for Accelerating Data Transfers in Commercial PIM Systems

Processing-in-memory (PIM) has emerged as a promising solution for accelerating memory-intensive workloads as they provide high memory bandwidth to the processing units. This approach has drawn attention not only from the academic community…

Hardware Architecture · Computer Science 2024-09-11 Dongjae Lee , Bongjoon Hyun , Taehun Kim , Minsoo Rhu

A$^3$PIM: An Automated, Analytic and Accurate Processing-in-Memory Offloader

The performance gap between memory and processor has grown rapidly. Consequently, the energy and wall-clock time costs associated with moving data between the CPU and main memory predominate the overall computational cost. The…

Hardware Architecture · Computer Science 2024-03-01 Qingcai Jiang , Shaojie Tan , Junshi Chen , Hong An

PIM-STM: Software Transactional Memory for Processing-In-Memory Systems

Processing-In-Memory (PIM) is a novel approach that augments existing DRAM memory chips with lightweight logic. By allowing to offload computations to the PIM system, this architecture allows for circumventing the data-bottleneck problem…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-01-18 André Lopes , Daniel Castro , Paolo Romano

Heterogeneous Data-Centric Architectures for Modern Data-Intensive Applications: Case Studies in Machine Learning and Databases

Today's computing systems require moving data back-and-forth between computing resources (e.g., CPUs, GPUs, accelerators) and off-chip main memory so that computation can take place on the data. Unfortunately, this data movement is a major…

Hardware Architecture · Computer Science 2022-05-31 Geraldo F. Oliveira , Amirali Boroumand , Saugata Ghose , Juan Gómez-Luna , Onur Mutlu

Machine Learning Training on a Real Processing-in-Memory System

Training machine learning algorithms is a computationally intensive process, which is frequently memory-bound due to repeatedly accessing large training datasets. As a result, processor-centric systems (e.g., CPU, GPU) suffer from costly…

Hardware Architecture · Computer Science 2022-08-04 Juan Gómez-Luna , Yuxin Guo , Sylvan Brocard , Julien Legriel , Remy Cimadomo , Geraldo F. Oliveira , Gagandeep Singh , Onur Mutlu

SimplePIM: A Software Framework for Productive and Efficient Processing-in-Memory

Data movement between memory and processors is a major bottleneck in modern computing systems. The processing-in-memory (PIM) paradigm aims to alleviate this bottleneck by performing computation inside memory chips. Real PIM hardware (e.g.,…

Hardware Architecture · Computer Science 2023-10-04 Jinfan Chen , Juan Gómez-Luna , Izzat El Hajj , Yuxin Guo , Onur Mutlu

ALPHA-PIM: Analysis of Linear Algebraic Processing for High-Performance Graph Applications on a Real Processing-In-Memory System

Processing large-scale graph datasets is computationally intensive and time-consuming. Processor-centric CPU and GPU architectures, commonly used for graph applications, often face bottlenecks caused by extensive data movement between the…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-11 Marzieh Barkhordar , Alireza Tabatabaeian , Mohammad Sadrosadati , Christina Giannoula , Juan Gomez Luna , Izzat El Hajj , Onur Mutlu , Alaa R. Alameldeen

Taking Cryptography Out of the Data Path via Near-Memory Processing in DRAM

Cryptographic algorithms such as AES-128 and SHA-256 are fundamental to ensuring data security and integrity. Although these algorithms are computationally efficient, their performance is often constrained by the processor-centric…

Cryptography and Security · Computer Science 2026-05-20 Nicola Barcarolo , Brahmaiah Gandham , Mohammad Sadrosadati , Roberto Passerone , Onur Mutlu , Flavio Vella

Dataflow-Aware PIM-Enabled Manycore Architecture for Deep Learning Workloads

Processing-in-memory (PIM) has emerged as an enabler for the energy-efficient and high-performance acceleration of deep learning (DL) workloads. Resistive random-access memory (ReRAM) is one of the most promising technologies to implement…

Hardware Architecture · Computer Science 2024-03-29 Harsh Sharma , Gaurav Narang , Janardhan Rao Doppa , Umit Ogras , Partha Pratim Pande

A Comparative Study of Digital Memristor-Based Processing-In-Memory from a Device and Reliability Perspective

As data-intensive applications increasingly strain conventional computing systems, processing-in-memory (PIM) has emerged as a promising paradigm to alleviate the memory wall by minimizing data transfer between memory and processing units.…

Emerging Technologies · Computer Science 2026-02-05 Thomas Neuner , Henriette Padberg , Lior Kornblum , Eilam Yalon , Pedram Khalili Amiri , Shahar Kvatinsky

No One-Size-Fits-All: A Workload-Driven Characterization of Bit-Parallel vs. Bit-Serial Data Layouts for Processing-using-Memory

Processing-in-Memory (PIM) is a promising approach to overcoming the memory-wall bottleneck. However, the PIM community has largely treated its two fundamental data layouts, Bit-Parallel (BP) and Bit-Serial (BS), as if they were…

Hardware Architecture · Computer Science 2025-10-01 Jingyao Zhang , Elaheh Sadredini

An Experimental Evaluation of Machine Learning Training on a Real Processing-in-Memory System

Training machine learning (ML) algorithms is a computationally intensive process, which is frequently memory-bound due to repeatedly accessing large training datasets. As a result, processor-centric systems (e.g., CPU, GPU) suffer from…

Hardware Architecture · Computer Science 2023-09-07 Juan Gómez-Luna , Yuxin Guo , Sylvan Brocard , Julien Legriel , Remy Cimadomo , Geraldo F. Oliveira , Gagandeep Singh , Onur Mutlu