Related papers: CIM-MLC: A Multi-level Compilation Stack for Compu…

CLSA-CIM: A Cross-Layer Scheduling Approach for Computing-in-Memory Architectures

The demand for efficient machine learning (ML) accelerators is growing rapidly, driving the development of novel computing concepts such as resistive random access memory (RRAM)-based tiled computing-in-memory (CIM) architectures. CIM…

Hardware Architecture · Computer Science 2024-01-18 Rebecca Pelke , Jose Cubero-Cascante , Nils Bosbach , Felix Staudigl , Rainer Leupers , Jan Moritz Joseph

Be CIM or Be Memory: A Dual-mode-aware DNN Compiler for CIM Accelerators

Computing-in-memory (CIM) architectures demonstrate superior performance over traditional architectures. To unleash the potential of CIM accelerators, many compilation methods have been proposed, focusing on application scheduling…

Hardware Architecture · Computer Science 2025-02-25 Shixin Zhao , Yuming Li , Bing Li , Yintao He , Mengdi Wang , Yinhe Han , Ying Wang

Mixed-Precision Training and Compilation for RRAM-based Computing-in-Memory Accelerators

Computing-in-Memory (CIM) accelerators are a promising solution for accelerating Machine Learning (ML) workloads, as they perform Matrix-Vector Multiplications (MVMs) on crossbar arrays directly in memory. Although the bit widths of the…

Machine Learning · Computer Science 2026-03-20 Rebecca Pelke , Joel Klein , Jose Cubero-Cascante , Nils Bosbach , Jan Moritz Joseph , Rainer Leupers

CINM (Cinnamon): A Compilation Infrastructure for Heterogeneous Compute In-Memory and Compute Near-Memory Paradigms

The rise of data-intensive applications exposed the limitations of conventional processor-centric von-Neumann architectures that struggle to meet the off-chip memory bandwidth demand. Therefore, recent innovations in computer architecture…

Hardware Architecture · Computer Science 2024-05-28 Asif Ali Khan , Hamid Farzaneh , Karl F. A. Friebel , Clément Fournier , Lorenzo Chelini , Jeronimo Castrillon

Compiling Neural Networks for a Computational Memory Accelerator

Computational memory (CM) is a promising approach for accelerating inference on neural networks (NN) by using enhanced memories that, in addition to storing data, allow computations on them. One of the main challenges of this approach is…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-27 Kornilios Kourtis , Martino Dazzi , Nikolas Ioannou , Tobias Grosser , Abu Sebastian , Evangelos Eleftheriou

CIMFlow: An Integrated Framework for Systematic Design and Evaluation of Digital CIM Architectures

Digital Compute-in-Memory (CIM) architectures have shown great promise in Deep Neural Network (DNN) acceleration by effectively addressing the "memory wall" bottleneck. However, the development and optimization of digital CIM accelerators…

Hardware Architecture · Computer Science 2025-05-05 Yingjie Qi , Jianlei Yang , Yiou Wang , Yikun Wang , Dayu Wang , Ling Tang , Cenlin Duan , Xiaolin He , Weisheng Zhao

DCC: Data-Centric Compilation of Machine Learning Kernels for Processing-In-Memory Architectures

High-performance Host processors can integrate Processing-In-Memory (PIM) devices, which can accelerate memory-intensive kernels of Machine Learning (ML) models, including Large Language Models (LLMs), by leveraging the large memory…

Hardware Architecture · Computer Science 2026-05-25 Peiming Yang , Sankeerth Durvasula , Ivan Fernandez , Mohammad Sadrosadati , Onur Mutlu , Gennady Pekhimenko , Christina Giannoula

PIMCOMP: A Universal Compilation Framework for Crossbar-based PIM DNN Accelerators

Crossbar-based PIM DNN accelerators can provide massively parallel in-situ operations. A specifically designed compiler is important to achieve high performance for a wide variety of DNN workloads. However, some key compilation issues such…

Hardware Architecture · Computer Science 2023-07-06 Xiaotian Sun , Xinyu Wang , Wanqian Li , Lei Wang , Yinhe Han , Xiaoming Chen

CMLCompiler: A Unified Compiler for Classical Machine Learning

Classical machine learning (CML) occupies nearly half of machine learning pipelines in production applications. Unfortunately, it fails to utilize the state-of-the-practice devices fully and performs poorly. Without a unified framework, the…

Machine Learning · Computer Science 2023-05-01 Xu Wen , Wanling Gao , Anzheng Li , Lei Wang , Zihan Jiang , Jianfeng Zhan

MLIR: A Compiler Infrastructure for the End of Moore's Law

This work presents MLIR, a novel approach to building reusable and extensible compiler infrastructure. MLIR aims to address software fragmentation, improve compilation for heterogeneous hardware, significantly reduce the cost of building…

Programming Languages · Computer Science 2020-03-03 Chris Lattner , Mehdi Amini , Uday Bondhugula , Albert Cohen , Andy Davis , Jacques Pienaar , River Riddle , Tatiana Shpeisman , Nicolas Vasilache , Oleksandr Zinenko

WWW: What, When, Where to Compute-in-Memory

Matrix multiplication is the dominant computation during Machine Learning (ML) inference. To efficiently perform such multiplication operations, Compute-in-memory (CiM) paradigms have emerged as a highly energy efficient solution. However,…

Hardware Architecture · Computer Science 2025-03-03 Tanvi Sharma , Mustafa Ali , Indranil Chakraborty , Kaushik Roy

PIMCOMP: An End-to-End DNN Compiler for Processing-In-Memory Accelerators

Various processing-in-memory (PIM) accelerators based on various devices, micro-architectures, and interfaces have been proposed to accelerate deep neural networks (DNNs). How to deploy DNNs onto PIM-based accelerators is the key to explore…

Hardware Architecture · Computer Science 2024-11-15 Xiaotian Sun , Xinyu Wang , Wanqian Li , Yinhe Han , Xiaoming Chen

Joint Hardware-Workload Co-Optimization for In-Memory Computing Accelerators

Software-hardware co-design is essential for optimizing in-memory computing (IMC) hardware accelerators for neural networks. However, most existing optimization frameworks target a single workload, leading to highly specialized hardware…

Hardware Architecture · Computer Science 2026-03-05 Olga Krestinskaya , Mohammed E. Fouda , Ahmed Eltawil , Khaled N. Salama

abstractPIM: A Technology Backward-Compatible Compilation Flow for Processing-In-Memory

The von Neumann architecture, in which the memory and the computation units are separated, demands massive data traffic between the memory and the CPU. To reduce data movement, new technologies and computer architectures have been explored.…

Emerging Technologies · Computer Science 2022-09-01 Adi Eliahu , Rotem Ben-Hur , Ronny Ronen , Shahar Kvatinsky

TDO-CIM: Transparent Detection and Offloading for Computation In-memory

Computation in-memory is a promising non-von Neumann approach aiming at completely diminishing the data transfer to and from the memory subsystem. Although a lot of architectures have been proposed, compiler support for such architectures…

Hardware Architecture · Computer Science 2020-07-02 Kanishkan Vadivel , Lorenzo Chelini , Ali BanaGozar , Gagandeep Singh , Stefano Corda , Roel Jordans , Henk Corporaal

Eva-CiM: A System-Level Performance and Energy Evaluation Framework for Computing-in-Memory Architectures

Computing-in-Memory (CiM) architectures aim to reduce costly data transfers by performing arithmetic and logic operations in memory and hence relieve the pressure due to the memory wall. However, determining whether a given workload can…

Hardware Architecture · Computer Science 2020-01-16 Di Gao , Dayane Reis , Xiaobo Sharon Hu , Cheng Zhuo

LIMCA: LLM for Automating Analog In-Memory Computing Architecture Design Exploration

Resistive crossbars enabling analog In-Memory Computing (IMC) have emerged as a promising architecture for Deep Neural Network (DNN) acceleration, offering high memory bandwidth and in-situ computation. However, the manual,…

Hardware Architecture · Computer Science 2025-03-18 Deepak Vungarala , Md Hasibul Amin , Pietro Mercati , Arnob Ghosh , Arman Roohi , Ramtin Zand , Shaahin Angizi

A High-Level Compiler Integration Approach for Deep Learning Accelerators Supporting Abstraction and Optimization

The growing adoption of domain-specific architectures in edge computing platforms for deep learning has highlighted the efficiency of hardware accelerators. However, integrating custom accelerators into modern machine learning (ML)…

Machine Learning · Computer Science 2025-07-08 Samira Ahmadifarsani , Daniel Mueller-Gritschneder , Ulf Schlichtmann

CIM-Tuner: Balancing the Compute and Storage Capacity of SRAM-CIM Accelerator via Hardware-mapping Co-exploration

As an emerging type of AI computing accelerator, SRAM Computing-In-Memory (CIM) accelerators feature high energy efficiency and throughput. However, various CIM designs and under-explored mapping strategies impede the full exploration of…

Hardware Architecture · Computer Science 2026-01-27 Jinwu Chen , Yuhui Shi , He Wang , Zhe Jiang , Jun Yang , Xin Si , Zhenhua Zhu

Breaking Barriers: Maximizing Array Utilization for Compute In-Memory Fabrics

Compute in-memory (CIM) is a promising technique that minimizes data transport, the primary performance bottleneck and energy cost of most data intensive applications. This has found wide-spread adoption in accelerating neural networks for…

Hardware Architecture · Computer Science 2020-08-18 Brian Crafton , Samuel Spetalnick , Gauthaman Murali , Tushar Krishna , Sung-Kyu Lim , Arijit Raychowdhury