Related papers: SIMDRAM: A Framework for Bit-Serial SIMD Processin…

SIMDRAM: An End-to-End Framework for Bit-Serial SIMD Computing in DRAM

Processing-using-DRAM has been proposed for a limited set of basic operations (i.e., logic operations, addition). However, in order to enable full adoption of processing-using-DRAM, it is necessary to provide support for more complex…

Hardware Architecture · Computer Science 2021-07-01 Nastaran Hajinazar , Geraldo F. Oliveira , Sven Gregorio , João Ferreira , Nika Mansouri Ghiasi , Minesh Patel , Mohammed Alser , Saugata Ghose , Juan Gómez Luna , Onur Mutlu

MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Processing

Processing-using-DRAM (PUD) is a processing-in-memory (PIM) approach that uses a DRAM array's massive internal parallelism to execute very-wide data-parallel operations, in a single-instruction multiple-data (SIMD) fashion. However, DRAM…

Hardware Architecture · Computer Science 2024-03-05 Geraldo F. Oliveira , Ataberk Olgun , Abdullah Giray Yağlıkçı , F. Nisa Bostancı , Juan Gómez-Luna , Saugata Ghose , Onur Mutlu

Membrane: Accelerating Database Analytics with Bank-Level DRAM-PIM Filtering

In-memory database query processing frequently involves substantial data transfers between the CPU and memory, leading to inefficiencies due to Von Neumann bottleneck. Processing-in-Memory (PIM) architectures offer a viable solution to…

Hardware Architecture · Computer Science 2025-04-10 Akhil Shekar , Kevin Gaffney , Martin Prammer , Khyati Kiyawat , Lingxi Wu , Helena Caminal , Zhenxing Fan , Yimin Gao , Ashish Venkat , José F. Martínez , Jignesh Patel , Kevin Skadron

Sectored DRAM: A Practical Energy-Efficient and High-Performance Fine-Grained DRAM Architecture

We propose Sectored DRAM, a new, low-overhead DRAM substrate that reduces wasted energy by enabling fine-grained DRAM data transfers and DRAM row activation. Sectored DRAM leverages two key ideas to enable fine-grained data transfers and…

Hardware Architecture · Computer Science 2024-06-11 Ataberk Olgun , F. Nisa Bostanci , Geraldo F. Oliveira , Yahya Can Tugrul , Rahul Bera , A. Giray Yaglikci , Hasan Hassan , Oguz Ergin , Onur Mutlu

Enabling the Adoption of Processing-in-Memory: Challenges, Mechanisms, Future Research Directions

Poor DRAM technology scaling over the course of many years has caused DRAM-based main memory to increasingly become a larger system bottleneck. A major reason for the bottleneck is that data stored within DRAM must be moved across a…

Hardware Architecture · Computer Science 2018-02-02 Saugata Ghose , Kevin Hsieh , Amirali Boroumand , Rachata Ausavarungnirun , Onur Mutlu

PIM-MMU: A Memory Management Unit for Accelerating Data Transfers in Commercial PIM Systems

Processing-in-memory (PIM) has emerged as a promising solution for accelerating memory-intensive workloads as they provide high memory bandwidth to the processing units. This approach has drawn attention not only from the academic community…

Hardware Architecture · Computer Science 2024-09-11 Dongjae Lee , Bongjoon Hyun , Taehun Kim , Minsoo Rhu

SIMD-X: Programming and Processing of Graph Algorithms on GPUs

With high computation power and memory bandwidth, graphics processing units (GPUs) lend themselves to accelerate data-intensive analytics, especially when such applications fit the single instruction multiple data (SIMD) model. However,…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-12-12 Hang Liu , H. Howie Huang

Accelerating Bulk Bit-Wise X(N)OR Operation in Processing-in-DRAM Platform

With Von-Neumann computing architectures struggling to address computationally- and memory-intensive big data analytic task today, Processing-in-Memory (PIM) platforms are gaining growing interests. In this way, processing-in-DRAM…

Hardware Architecture · Computer Science 2019-04-12 Shaahin Angizi , Deliang Fan

PIMSAB: A Processing-In-Memory System with Spatially-Aware Communication and Bit-Serial-Aware Computation

Bit-serial Processing-In-Memory (PIM) is an attractive paradigm for accelerator architectures, for parallel workloads such as Deep Learning (DL), because of its capability to achieve massive data parallelism at a low area overhead and…

Hardware Architecture · Computer Science 2023-11-21 Aman Arora , Jian Weng , Siyuan Ma , Tony Nowatzki , Lizy K. John

PiDRAM: An FPGA-based Framework for End-to-end Evaluation of Processing-in-DRAM Techniques

DRAM-based main memory is used in nearly all computing systems as a major component. One way of overcoming the main memory bottleneck is to move computation near memory, a paradigm known as processing-in-memory (PiM). Recent PiM techniques…

Hardware Architecture · Computer Science 2022-06-02 Ataberk Olgun , Juan Gomez Luna , Konstantinos Kanellopoulos , Behzad Salami , Hasan Hassan , Oguz Ergin , Onur Mutlu

Data-Centric and Data-Aware Frameworks for Fundamentally Efficient Data Handling in Modern Computing Systems

There is an explosive growth in the size of the input and/or intermediate data used and generated by modern and emerging applications. Unfortunately, modern computing systems are not capable of handling large amounts of data efficiently.…

Hardware Architecture · Computer Science 2021-09-14 Nastaran Hajinazar

Concurrent Processing Memory

A theoretical memory with limited processing power and internal connectivity at each element is proposed. This memory carries out parallel processing within itself to solve generic array problems. The applicability of this in-memory…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-09-28 Chengpu Wang

SimplePIM: A Software Framework for Productive and Efficient Processing-in-Memory

Data movement between memory and processors is a major bottleneck in modern computing systems. The processing-in-memory (PIM) paradigm aims to alleviate this bottleneck by performing computation inside memory chips. Real PIM hardware (e.g.,…

Hardware Architecture · Computer Science 2023-10-04 Jinfan Chen , Juan Gómez-Luna , Izzat El Hajj , Yuxin Guo , Onur Mutlu

Shared-PIM: Enabling Concurrent Computation and Data Flow for Faster Processing-in-DRAM

Processing-in-Memory (PIM) enhances memory with computational capabilities, potentially solving energy and latency issues associated with data transfer between memory and processors. However, managing concurrent computation and data flow…

Hardware Architecture · Computer Science 2025-05-09 Ahmed Mamdouh , Haoran Geng , Michael Niemier , Xiaobo Sharon Hu , Dayane Reis

Neural-PIM: Efficient Processing-In-Memory with Neural Approximation of Peripherals

Processing-in-memory (PIM) architectures have demonstrated great potential in accelerating numerous deep learning tasks. Particularly, resistive random-access memory (RRAM) devices provide a promising hardware substrate to build PIM…

Hardware Architecture · Computer Science 2022-02-01 Weidong Cao , Yilong Zhao , Adith Boloor , Yinhe Han , Xuan Zhang , Li Jiang

Memory-Centric Computing: Recent Advances in Processing-in-DRAM

Memory-centric computing aims to enable computation capability in and near all places where data is generated and stored. As such, it can greatly reduce the large negative performance and energy impact of data access and data movement, by…

Hardware Architecture · Computer Science 2024-12-30 Onur Mutlu , Ataberk Olgun , Geraldo F. Oliveira , Ismail Emir Yuksel

A Modern Primer on Processing in Memory

This paper discusses recent research that aims to enable computation close to data, an approach we broadly call processing-in-memory (PIM). PIM places computation mechanisms in or near where the data is stored (i.e., inside memory chips or…

Hardware Architecture · Computer Science 2025-02-07 Onur Mutlu , Saugata Ghose , Juan Gómez-Luna , Rachata Ausavarungnirun , Mohammad Sadrosadati , Geraldo F. Oliveira

Darwin: A DRAM-based Multi-level Processing-in-Memory Architecture for Data Analytics

Processing-in-memory (PIM) architecture is an inherent match for data analytics application, but we observe major challenges to address when accelerating it using PIM. In this paper, we propose Darwin, a practical LRDIMM-based multi-level…

Systems and Control · Electrical Eng. & Systems 2025-09-23 Donghyuk Kim , Jae-Young Kim , Wontak Han , Jongsoon Won , Haerang Choi , Yongkee Kwon , Joo-Young Kim

CRAM: Efficient Hardware-Based Memory Compression for Bandwidth Enhancement

This paper investigates hardware-based memory compression designs to increase the memory bandwidth. When lines are compressible, the hardware can store multiple lines in a single memory location, and retrieve all these lines in a single…

Hardware Architecture · Computer Science 2018-07-23 Vinson Young , Sanjay Kariyappa , Moinuddin K. Qureshi

DARTH-PUM: A Hybrid Processing-Using-Memory Architecture

Analog processing-using-memory (PUM; a.k.a. in-memory computing) makes use of electrical interactions inside memory arrays to perform bulk matrix-vector multiplication (MVM) operations. However, many popular matrix-based kernels need to…

Hardware Architecture · Computer Science 2026-05-06 Ryan Wong , Ben Feinberg , Saugata Ghose