Related papers: 3D-TrIM: A Memory-Efficient Spatial Computing Arch…

TrIM, Triangular Input Movement Systolic Array for Convolutional Neural Networks: Dataflow and Analytical Modelling

In order to follow the ever-growing computational complexity and data intensity of state-of-the-art AI models, new computing paradigms are being proposed. These paradigms aim at achieving high energy efficiency by mitigating the Von Neumann…

Artificial Intelligence · Computer Science 2025-08-01 Cristian Sestito , Shady Agwa , Themis Prodromakis

TrIM, Triangular Input Movement Systolic Array for Convolutional Neural Networks: Architecture and Hardware Implementation

Modern hardware architectures for Convolutional Neural Networks (CNNs), other than targeting high performance, aim at dissipating limited energy. Reducing the data movement cost between the computing cores and the memory is a way to…

Hardware Architecture · Computer Science 2025-01-15 Cristian Sestito , Shady Agwa , Themis Prodromakis

Triangle Counting Accelerations: From Algorithm to In-Memory Computing Architecture

Triangles are the basic substructure of networks and triangle counting (TC) has been a fundamental graph computing problem in numerous fields such as social network analysis. Nevertheless, like other graph computing problems, due to the…

Hardware Architecture · Computer Science 2021-12-02 Xueyan Wang , Jianlei Yang , Yinglin Zhao , Xiaotao Jia , Rong Yin , Xuhang Chen , Gang Qu , Weisheng Zhao

Optimizing and Exploring System Performance in Compact Processing-in-Memory-based Chips

Processing-in-memory (PIM) is a promising computing paradigm to tackle the "memory wall" challenge. However, PIM system-level benefits over traditional von Neumann architecture can be reduced when the memory array cannot fully store all the…

Hardware Architecture · Computer Science 2025-03-03 Peilin Chen , Xiaoxuan Yang

A Time- and Energy-Efficient CNN with Dense Connections on Memristor-Based Chips

Designing lightweight convolutional neural network (CNN) models is an active research area in edge AI. Compute-in-memory (CIM) provides a new computing paradigm to alleviate time and energy consumption caused by data transfer in von Neumann…

Hardware Architecture · Computer Science 2025-08-19 Wenyong Zhou , Yuan Ren , Jiajun Zhou , Tianshu Hou , Ngai Wong

Domino: A Tailored Network-on-Chip Architecture to Enable Highly Localized Inter- and Intra-Memory DNN Computing

The ever-increasing computation complexity of fast-growing Deep Neural Networks (DNNs) has requested new computing paradigms to overcome the memory wall in conventional Von Neumann computing architectures. The emerging Computing-In-Memory…

Hardware Architecture · Computer Science 2021-07-21 Kaining Zhou , Yangshuo He , Rui Xiao , Kejie Huang

Computing-In-Memory Dataflow for Minimal Buffer Traffic

Computing-In-Memory (CIM) offers a potential solution to the memory wall issue and can achieve high energy efficiency by minimizing data movement, making it a promising architecture for edge AI devices. Lightweight models like MobileNet and…

Hardware Architecture · Computer Science 2025-08-21 Choongseok Song , Doo Seok Jeong

Processing Data Where It Makes Sense: Enabling In-Memory Computation

Today's systems are overwhelmingly designed to move data to computation. This design choice goes directly against at least three key trends in systems that cause performance, scalability and energy bottlenecks: (1) data access from memory…

Hardware Architecture · Computer Science 2019-03-12 Onur Mutlu , Saugata Ghose , Juan Gómez-Luna , Rachata Ausavarungnirun

Enabling the Adoption of Processing-in-Memory: Challenges, Mechanisms, Future Research Directions

Poor DRAM technology scaling over the course of many years has caused DRAM-based main memory to increasingly become a larger system bottleneck. A major reason for the bottleneck is that data stored within DRAM must be moved across a…

Hardware Architecture · Computer Science 2018-02-02 Saugata Ghose , Kevin Hsieh , Amirali Boroumand , Rachata Ausavarungnirun , Onur Mutlu

A Customized NoC Architecture to Enable Highly Localized Computing-On-the-Move DNN Dataflow

The ever-increasing computation complexity of fast-growing Deep Neural Networks (DNNs) has requested new computing paradigms to overcome the memory wall in conventional Von Neumann computing architectures. The emerging Computing-In-Memory…

Hardware Architecture · Computer Science 2021-12-14 Kaining Zhou , Yangshuo He , Rui Xiao , Jiayi Liu , Kejie Huang

Efficient SRAM-PIM Co-design by Joint Exploration of Value-Level and Bit-Level Sparsity

Processing-in-memory (PIM) is a transformative architectural paradigm designed to overcome the Von Neumann bottleneck. Among PIM architectures, digital SRAM-PIM emerges as a promising solution, offering significant advantages by directly…

Hardware Architecture · Computer Science 2025-06-13 Cenlin Duan , Jianlei Yang , Yikun Wang , Yiou Wang , Yingjie Qi , Xiaolin He , Bonan Yan , Xueyan Wang , Xiaotao Jia , Weisheng Zhao

Breaking Barriers: Maximizing Array Utilization for Compute In-Memory Fabrics

Compute in-memory (CIM) is a promising technique that minimizes data transport, the primary performance bottleneck and energy cost of most data intensive applications. This has found wide-spread adoption in accelerating neural networks for…

Hardware Architecture · Computer Science 2020-08-18 Brian Crafton , Samuel Spetalnick , Gauthaman Murali , Tushar Krishna , Sung-Kyu Lim , Arijit Raychowdhury

An Overview of In-memory Processing with Emerging Non-volatile Memory for Data-intensive Applications

The conventional von Neumann architecture has been revealed as a major performance and energy bottleneck for rising data-intensive applications. The decade-old idea of leveraging in-memory processing…

Hardware Architecture · Computer Science 2019-06-18 Bing Li , Bonan Yan , Hai Li

NAND-SPIN-Based Processing-in-MRAM Architecture for Convolutional Neural Network Acceleration

The performance and efficiency of running large-scale datasets on traditional computing systems exhibit critical bottlenecks due to the existing "power wall" and "memory wall" problems. To resolve those problems, processing-in-memory (PIM)…

Hardware Architecture · Computer Science 2022-04-22 Yinglin Zhao , Jianlei Yang , Bing Li , Xingzhou Cheng , Xucheng Ye , Xueyan Wang , Xiaotao Jia , Zhaohao Wang , Youguang Zhang , Weisheng Zhao

Full-Stack Optimization for CAM-Only DNN Inference

The accuracy of neural networks has greatly improved across various domains over the past years. Their ever-increasing complexity, however, leads to prohibitively high energy demands and latency in von Neumann systems. Several…

Hardware Architecture · Computer Science 2024-01-24 João Paulo C. de Lima , Asif Ali Khan , Luigi Carro , Jeronimo Castrillon

ConvPIM: Evaluating Digital Processing-in-Memory through Convolutional Neural Network Acceleration

Processing-in-memory (PIM) architectures are emerging to reduce data movement in data-intensive applications. These architectures seek to exploit the same physical devices for both information storage and logic, thereby dwarfing the…

Hardware Architecture · Computer Science 2023-05-09 Orian Leitersdorf , Ronny Ronen , Shahar Kvatinsky

NicePIM: Design Space Exploration for Processing-In-Memory DNN Accelerators with 3D-Stacked-DRAM

With the widespread use of deep neural networks (DNNs) in intelligent systems, DNN accelerators with high performance and energy efficiency are greatly demanded. As one of the feasible processing-in-memory (PIM) architectures,…

Hardware Architecture · Computer Science 2023-12-22 Junpeng Wang , Mengke Ge , Bo Ding , Qi Xu , Song Chen , Yi Kang

An Energy-Efficient Heterogeneous Memory Architecture for Future Dark Silicon Embedded Chip-Multiprocessors

Main memories play an important role in overall energy consumption of embedded systems. Using conventional memory technologies in future designs in nanoscale era causes a drastic increase in leakage power consumption and temperature-related…

Hardware Architecture · Computer Science 2019-12-16 Salman Onsori , Arghavan Asad , Kaamran Raahemifar , Mahmood Fathy

DL-PIM: Improving Data Locality in Processing-in-Memory Systems

PIM architectures aim to reduce data transfer costs between processors and memory by integrating processing units within memory layers. Prior PIM architectures have shown potential to improve energy efficiency and performance. However, such…

Hardware Architecture · Computer Science 2025-10-10 Parker Hao Tian , Zahra Yousefijamarani , Alaa Alameldeen

Im2win: Memory Efficient Convolution On SIMD Architectures

Convolution is the most expensive operation among neural network operations, thus its performance is critical to the overall performance of neural networks. Commonly used convolution approaches, including general matrix multiplication…

Neural and Evolutionary Computing · Computer Science 2023-06-27 Shuai Lu , Jun Chu , Xu T. Liu