Related papers: COMPASS: A Compiler Framework for Resource-Constra…

Breaking Barriers: Maximizing Array Utilization for Compute In-Memory Fabrics

Compute in-memory (CIM) is a promising technique that minimizes data transport, the primary performance bottleneck and energy cost of most data intensive applications. This has found wide-spread adoption in accelerating neural networks for…

Hardware Architecture · Computer Science 2020-08-18 Brian Crafton , Samuel Spetalnick , Gauthaman Murali , Tushar Krishna , Sung-Kyu Lim , Arijit Raychowdhury

RNC: Efficient RRAM-aware NAS and Compilation for DNNs on Resource-Constrained Edge Devices

Computing-in-memory (CIM) is an emerging computing paradigm, offering noteworthy potential for accelerating neural networks with high parallelism, low latency, and energy efficiency compared to conventional von Neumann architectures.…

Neural and Evolutionary Computing · Computer Science 2024-09-30 Kam Chi Loong , Shihao Han , Sishuo Liu , Ning Lin , Zhongrui Wang

MARS: Multi-macro Architecture SRAM CIM-Based Accelerator with Co-designed Compressed Neural Networks

Convolutional neural networks (CNNs) play a key role in deep learning applications. However, the large storage overheads and the substantial computation cost of CNNs are problematic in hardware accelerators. Computing-in-memory (CIM)…

Hardware Architecture · Computer Science 2021-05-26 Syuan-Hao Sie , Jye-Luen Lee , Yi-Ren Chen , Chih-Cheng Lu , Chih-Cheng Hsieh , Meng-Fan Chang , Kea-Tiong Tang

PIMCOMP: An End-to-End DNN Compiler for Processing-In-Memory Accelerators

Various processing-in-memory (PIM) accelerators based on various devices, micro-architectures, and interfaces have been proposed to accelerate deep neural networks (DNNs). How to deploy DNNs onto PIM-based accelerators is the key to explore…

Hardware Architecture · Computer Science 2024-11-15 Xiaotian Sun , Xinyu Wang , Wanqian Li , Yinhe Han , Xiaoming Chen

PIMCOMP: A Universal Compilation Framework for Crossbar-based PIM DNN Accelerators

Crossbar-based PIM DNN accelerators can provide massively parallel in-situ operations. A specifically designed compiler is important to achieve high performance for a wide variety of DNN workloads. However, some key compilation issues such…

Hardware Architecture · Computer Science 2023-07-06 Xiaotian Sun , Xinyu Wang , Wanqian Li , Lei Wang , Yinhe Han , Xiaoming Chen

ARAS: An Adaptive Low-Cost ReRAM-Based Accelerator for DNNs

Processing Using Memory (PUM) accelerators have the potential to perform Deep Neural Network (DNN) inference by using arrays of memory cells as computation engines. Among various memory technologies, ReRAM crossbars show promising…

Hardware Architecture · Computer Science 2024-10-24 Mohammad Sabri , Marc Riera , Antonio González

Mixed-Precision Training and Compilation for RRAM-based Computing-in-Memory Accelerators

Computing-in-Memory (CIM) accelerators are a promising solution for accelerating Machine Learning (ML) workloads, as they perform Matrix-Vector Multiplications (MVMs) on crossbar arrays directly in memory. Although the bit widths of the…

Machine Learning · Computer Science 2026-03-20 Rebecca Pelke , Joel Klein , Jose Cubero-Cascante , Nils Bosbach , Jan Moritz Joseph , Rainer Leupers

Fast-OverlaPIM: A Fast Overlap-driven Mapping Framework for Processing In-Memory Neural Network Acceleration

Processing in-memory (PIM) is promising to accelerate neural networks (NNs) because it minimizes data movement and provides large computational parallelism. Similar to machine learning accelerators, application mapping, which determines the…

Hardware Architecture · Computer Science 2024-07-02 Xuan Wang , Minxuan Zhou , Tajana Rosing

BasisN: Reprogramming-Free RRAM-Based In-Memory-Computing by Basis Combination for Deep Neural Networks

Deep neural networks (DNNs) have made breakthroughs in various fields including image recognition and language processing. DNNs execute hundreds of millions of multiply-and-accumulate (MAC) operations. To efficiently accelerate such…

Systems and Control · Electrical Eng. & Systems 2024-07-08 Amro Eldebiky , Grace Li Zhang , Xunzhao Yin , Cheng Zhuo , Ing-Chao Lin , Ulf Schlichtmann , Bing Li

High Area/Energy Efficiency RRAM CNN Accelerator with Kernel-Reordering Weight Mapping Scheme Based on Pattern Pruning

Resistive Random Access Memory (RRAM) is an emerging device for processing-in-memory (PIM) architecture to accelerate convolutional neural network (CNN). However, due to the highly coupled crossbar structure in the RRAM array, it is…

Hardware Architecture · Computer Science 2020-10-14 Songming Yu , Yongpan Liu , Lu Zhang , Jingyu Wang , Jinshan Yue , Zhuqing Yuan , Xueqing Li , Huazhong Yang

Bayesian Optimization of Crossbar-Based Compute-In-Memory System Design for Efficient DNN Inference

Leveraging the high density and energy efficiency of Compute-In-Memory (CIM) crossbar-based Deep Neural Network (DNN) accelerators requires optimal Design Space Exploration (DSE), which becomes increasingly challenging as complex models for…

Emerging Technologies · Computer Science 2026-05-12 Arnob Saha , Bibhas Manna , Nikhil Kotikalapudi , Md Zesun Ahmed Mia , Rahul Kumar , Madhavan Swaminathan , Abhronil Sengupta

Be CIM or Be Memory: A Dual-mode-aware DNN Compiler for CIM Accelerators

Computing-in-memory (CIM) architectures demonstrate superior performance over traditional architectures. To unleash the potential of CIM accelerators, many compilation methods have been proposed, focusing on application scheduling…

Hardware Architecture · Computer Science 2025-02-25 Shixin Zhao , Yuming Li , Bing Li , Yintao He , Mengdi Wang , Yinhe Han , Ying Wang

ReCross: Efficient Embedding Reduction Scheme for In-Memory Computing using ReRAM-Based Crossbar

Deep learning-based recommendation models (DLRMs) are widely deployed in commercial applications to enhance user experience. However, the large and sparse embedding layers in these models impose substantial memory bandwidth bottlenecks due…

Hardware Architecture · Computer Science 2025-09-16 Yu-Hong Lai , Chieh-Lin Tsai , Wen Sheng Lim , Han-Wen Hu , Tei-Wei Kuo , Yuan-Hao Chang

Counting Cards: Exploiting Variance and Data Distributions for Robust Compute In-Memory

Compute in-memory (CIM) is a promising technique that minimizes data transport, the primary performance bottleneck and energy cost of most data intensive applications. This has found wide-spread adoption in accelerating neural networks for…

Signal Processing · Electrical Eng. & Systems 2021-02-16 Brian Crafton , Samuel Spetalnick , Arijit Raychowdhury

RAPIDNN: In-Memory Deep Neural Network Acceleration Framework

Deep neural networks (DNN) have demonstrated effectiveness for various applications such as image processing, video segmentation, and speech recognition. Running state-of-the-art DNNs on current systems mostly relies on either…

Neural and Evolutionary Computing · Computer Science 2019-04-15 Mohsen Imani , Mohammad Samragh , Yeseong Kim , Saransh Gupta , Farinaz Koushanfar , Tajana Rosing

A SOT-MRAM-based Processing-In-Memory Engine for Highly Compressed DNN Implementation

The computing wall and data movement challenges of deep neural networks (DNNs) have exposed the limitations of conventional CMOS-based DNN accelerators. Furthermore, the deep structure and large model size will make DNNs prohibitive to…

Signal Processing · Electrical Eng. & Systems 2019-12-12 Geng Yuan , Xiaolong Ma , Sheng Lin , Zhengang Li , Caiwen Ding

CIMinus: Empowering Sparse DNN Workloads Modeling and Exploration on SRAM-based CIM Architectures

Compute-in-memory (CIM) has emerged as a pivotal direction for accelerating workloads in the field of machine learning, such as Deep Neural Networks (DNNs). However, the effective exploitation of sparsity in CIM systems presents numerous…

Hardware Architecture · Computer Science 2025-11-21 Yingjie Qi , Jianlei Yang , Rubing Yang , Cenlin Duan , Xiaolin He , Ziyan He , Weitao Pan , Weisheng Zhao

MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems

Along with the fast evolution of deep neural networks, the hardware system is also developing rapidly. As a promising solution achieving high scalability and low manufacturing cost, multi-accelerator systems widely exist in data centers,…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-07-25 Guan Shen , Jieru Zhao , Zeke Wang , Zhe Lin , Wenchao Ding , Chentao Wu , Quan Chen , Minyi Guo

Memory-Efficient CNN Accelerator Based on Interlayer Feature Map Compression

Existing deep convolutional neural networks (CNNs) generate massive interlayer feature data during network inference. To maintain real-time processing in embedded systems, large on-chip memory is required to buffer the interlayer feature…

Hardware Architecture · Computer Science 2021-10-13 Zhuang Shao , Xiaoliang Chen , Li Du , Lei Chen , Yuan Du , Wei Zhuang , Huadong Wei , Chenjia Xie , Zhongfeng Wang

Tiny but Accurate: A Pruned, Quantized and Optimized Memristor Crossbar Framework for Ultra Efficient DNN Implementation

The state-of-art DNN structures involve intensive computation and high memory storage. To mitigate the challenges, the memristor crossbar array has emerged as an intrinsically suitable matrix computation and low-power acceleration framework…

Signal Processing · Electrical Eng. & Systems 2019-09-04 Xiaolong Ma , Geng Yuan , Sheng Lin , Caiwen Ding , Fuxun Yu , Tao Liu , Wujie Wen , Xiang Chen , Yanzhi Wang