Related papers: EPIM: Efficient Processing-In-Memory Accelerators …

ConvPIM: Evaluating Digital Processing-in-Memory through Convolutional Neural Network Acceleration

Processing-in-memory (PIM) architectures are emerging to reduce data movement in data-intensive applications. These architectures seek to exploit the same physical devices for both information storage and logic, thereby dwarfing the…

Hardware Architecture · Computer Science 2023-05-09 Orian Leitersdorf , Ronny Ronen , Shahar Kvatinsky

Empowering Malware Detection Efficiency within Processing-in-Memory Architecture

The widespread integration of embedded systems across various industries has facilitated seamless connectivity among devices and bolstered computational capabilities. Despite their extensive applications, embedded systems encounter…

Cryptography and Security · Computer Science 2024-04-16 Sreenitha Kasarapu , Sathwika Bavikadi , Sai Manoj Pudukotai Dinakarrao

Neural-PIM: Efficient Processing-In-Memory with Neural Approximation of Peripherals

Processing-in-memory (PIM) architectures have demonstrated great potential in accelerating numerous deep learning tasks. Particularly, resistive random-access memory (RRAM) devices provide a promising hardware substrate to build PIM…

Hardware Architecture · Computer Science 2022-02-01 Weidong Cao , Yilong Zhao , Adith Boloor , Yinhe Han , Xuan Zhang , Li Jiang

PIMCOMP: An End-to-End DNN Compiler for Processing-In-Memory Accelerators

Various processing-in-memory (PIM) accelerators based on various devices, micro-architectures, and interfaces have been proposed to accelerate deep neural networks (DNNs). How to deploy DNNs onto PIM-based accelerators is the key to explore…

Hardware Architecture · Computer Science 2024-11-15 Xiaotian Sun , Xinyu Wang , Wanqian Li , Yinhe Han , Xiaoming Chen

An Event-Based Digital Compute-In-Memory Accelerator with Flexible Operand Resolution and Layer-Wise Weight/Output Stationarity

Compute-in-memory (CIM) accelerators for spiking neural networks (SNNs) are promising solutions to enable $\mu$s-level inference latency and ultra-low energy in edge vision applications. Yet, their current lack of flexibility at both the…

Hardware Architecture · Computer Science 2024-10-31 Nicolas Chauvaux , Adrian Kneip , Christoph Posch , Kofi Makinwa , Charlotte Frenkel

Optimizing and Exploring System Performance in Compact Processing-in-Memory-based Chips

Processing-in-memory (PIM) is a promising computing paradigm to tackle the "memory wall" challenge. However, PIM system-level benefits over traditional von Neumann architecture can be reduced when the memory array cannot fully store all the…

Hardware Architecture · Computer Science 2025-03-03 Peilin Chen , Xiaoxuan Yang

SimplePIM: A Software Framework for Productive and Efficient Processing-in-Memory

Data movement between memory and processors is a major bottleneck in modern computing systems. The processing-in-memory (PIM) paradigm aims to alleviate this bottleneck by performing computation inside memory chips. Real PIM hardware (e.g.,…

Hardware Architecture · Computer Science 2023-10-04 Jinfan Chen , Juan Gómez-Luna , Izzat El Hajj , Yuxin Guo , Onur Mutlu

Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud

Neural networks (NNs) are growing in importance and complexity. A neural network's performance (and energy efficiency) can be bound either by computation or memory resources. The processing-in-memory (PIM) paradigm, where computation is…

Hardware Architecture · Computer Science 2023-03-28 Geraldo F. Oliveira , Juan Gómez-Luna , Saugata Ghose , Amirali Boroumand , Onur Mutlu

A SOT-MRAM-based Processing-In-Memory Engine for Highly Compressed DNN Implementation

The computing wall and data movement challenges of deep neural networks (DNNs) have exposed the limitations of conventional CMOS-based DNN accelerators. Furthermore, the deep structure and large model size will make DNNs prohibitive to…

Signal Processing · Electrical Eng. & Systems 2019-12-12 Geng Yuan , Xiaolong Ma , Sheng Lin , Zhengang Li , Caiwen Ding

PSCNN: A 885.86 TOPS/W Programmable SRAM-based Computing-In-Memory Processor for Keyword Spotting

Computing-in-memory (CIM) has attracted significant attentions in recent years due to its massive parallelism and low power consumption. However, current CIM designs suffer from large area overhead of small CIM macros and bad programmablity…

Hardware Architecture · Computer Science 2022-05-04 Shu-Hung Kuo , Tian-Sheuan Chang

Enabling Highly Efficient Capsule Networks Processing Through A PIM-Based Architecture Design

In recent years, the CNNs have achieved great successes in the image processing tasks, e.g., image recognition and object detection. Unfortunately, traditional CNN's classification is found to be easily misled by increasingly complex image…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-12 Xingyao Zhang , Shuaiwen Leon Song , Chenhao Xie , Jing Wang , Weigong Zhang , Xin Fu

ProactivePIM: Accelerating Weight-Sharing Embedding Layer with PIM for Scalable Recommendation System

Although deep learning-based personalized recommendation systems provide qualified recommendations, they strain data center resources. The main bottleneck is the embedding layer, which is highly memory-intensive due to its sparse, irregular…

Hardware Architecture · Computer Science 2025-11-26 Youngsuk Kim , Junghwan Lim , Hyuk-Jae Lee , Chae Eun Rhee

Fast-OverlaPIM: A Fast Overlap-driven Mapping Framework for Processing In-Memory Neural Network Acceleration

Processing in-memory (PIM) is promising to accelerate neural networks (NNs) because it minimizes data movement and provides large computational parallelism. Similar to machine learning accelerators, application mapping, which determines the…

Hardware Architecture · Computer Science 2024-07-02 Xuan Wang , Minxuan Zhou , Tajana Rosing

Efficient Sparse Processing-in-Memory Architecture (ESPIM) for Machine Learning Inference

Emerging machine learning (ML) models (e.g., transformers) involve memory pin bandwidth-bound matrix-vector (MV) computation in inference. By avoiding pin crossings, processing in memory (PIM) can improve performance and energy for…

Hardware Architecture · Computer Science 2024-04-09 Mingxuan He , Mithuna Thottethodi , T. N. Vijaykumar

P3-LLM: An Integrated NPU-PIM Accelerator for Edge LLM Inference Using Hybrid Numerical Formats

The substantial memory bandwidth and computational demands of large language models (LLMs) present critical challenges for efficient inference. To tackle this, the literature has explored heterogeneous systems that combine neural processing…

Hardware Architecture · Computer Science 2026-05-05 Yuzong Chen , Chao Fang , Xilai Dai , Yuheng Wu , Thierry Tambe , Marian Verhelst , Mohamed S. Abdelfattah

HPIM: Heterogeneous Processing-In-Memory-based Accelerator for Large Language Models Inference

The deployment of large language models (LLMs) presents significant challenges due to their enormous memory footprints, low arithmetic intensity, and stringent latency requirements, particularly during the autoregressive decoding stage.…

Hardware Architecture · Computer Science 2025-11-03 Cenlin Duan , Jianlei Yang , Rubing Yang , Yikun Wang , Yiou Wang , Lingkun Long , Yingjie Qi , Xiaolin He , Ao Zhou , Xueyan Wang , Weisheng Zhao

NAND-SPIN-Based Processing-in-MRAM Architecture for Convolutional Neural Network Acceleration

The performance and efficiency of running large-scale datasets on traditional computing systems exhibit critical bottlenecks due to the existing "power wall" and "memory wall" problems. To resolve those problems, processing-in-memory (PIM)…

Hardware Architecture · Computer Science 2022-04-22 Yinglin Zhao , Jianlei Yang , Bing Li , Xingzhou Cheng , Xucheng Ye , Xueyan Wang , Xiaotao Jia , Zhaohao Wang , Youguang Zhang , Weisheng Zhao

PIMSYN: Synthesizing Processing-in-memory CNN Accelerators

Processing-in-memory architectures have been regarded as a promising solution for CNN acceleration. Existing PIM accelerator designs rely heavily on the experience of experts and require significant manual design overhead. Manual design…

Hardware Architecture · Computer Science 2024-02-29 Wanqian Li , Xiaotian Sun , Xinyu Wang , Lei Wang , Yinhe Han , Xiaoming Chen

ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation

We introduce a fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high resolution images under resource constraints. ESPNet is based on a new convolutional module, efficient spatial pyramid (ESP), which is…

Computer Vision and Pattern Recognition · Computer Science 2018-07-26 Sachin Mehta , Mohammad Rastegari , Anat Caspi , Linda Shapiro , Hannaneh Hajishirzi

NicePIM: Design Space Exploration for Processing-In-Memory DNN Accelerators with 3D-Stacked-DRAM

With the widespread use of deep neural networks(DNNs) in intelligent systems, DNN accelerators with high performance and energy efficiency are greatly demanded. As one of the feasible processing-in-memory(PIM) architectures,…

Hardware Architecture · Computer Science 2023-12-22 Junpeng Wang , Mengke Ge , Bo Ding , Qi Xu , Song Chen , Yi Kang