English
Related papers

Related papers: Flexagon: A Multi-Dataflow Sparse-Sparse Matrix Mu…

200 papers

Sparse matrix-matrix multiplication (SpGEMM) is a critical operation in numerous fields, including scientific computing, graph analytics, and deep learning. These applications exploit the sparsity of matrices to reduce storage and…

Machine Learning · Computer Science 2024-08-30 Sanjali Yadav , Bahar Asgari

Artificial Intelligence (AI) algorithms, such as Deep Neural Networks (DNNs), have become an important tool for a wide range of applications, from computer vision to natural language processing. However, the computational complexity of DNN…

Sparse Matrix-matrix Multiplication (SpMM) and Sampled Dense-dense Matrix Multiplication (SDDMM) are important sparse operators in scientific computing and deep learning. Tensor Core Units (TCUs) enhance modern accelerators with superior…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-17 Jinliang Shi , Shigang Li , Youxuan Xu , Rongtian Fu , Xueying Wang , Tong Wu

Contemporary Deep Neural Network (DNN) contains millions of synaptic connections with tens to hundreds of layers. The large computation and memory requirements pose a challenge to the hardware design. In this work, we leverage the intrinsic…

Machine Learning · Computer Science 2017-11-07 Jingyang Zhu , Jingbo Jiang , Xizi Chen , Chi-Ying Tsui

Graph Convolutional Networks (GCNs) are widely adopted for tasks involving relational or graph-structured data and can be formulated as two-stage sparse-dense matrix multiplication (SpMM) during inference. However, existing accelerators…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-14 Bohan Li , Shengmin Li , Xinyu Shi , Enyi Yao , Francky Catthoor , Simei Yang

This paper introduces FlexNN, a Flexible Neural Network accelerator, which adopts agile design principles to enable versatile dataflows, enhancing energy efficiency. Unlike conventional convolutional neural network accelerator architectures…

Hardware Architecture · Computer Science 2025-06-27 Arnab Raha , Deepak A. Mathaikutty , Soumendu K. Ghosh , Shamik Kundu

Sparse matrix-vector and matrix-matrix multiplication (SpMV and SpMM) are fundamental in both conventional (graph analytics, scientific computing) and emerging (sparse DNN, GNN) domains. Workload-balancing and parallel-reduction are…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-10-15 Guyue Huang , Guohao Dai , Yu Wang , Yufei Ding , Yuan Xie

Generalized Sparse Matrix-Matrix Multiplication (SpGEMM) is a ubiquitous task in various engineering and scientific applications. However, inner product based SpGENN introduces redundant input fetches for mismatched nonzero operands, while…

Hardware Architecture · Computer Science 2024-04-05 Zhekai Zhang , Hanrui Wang , Song Han , William J. Dally

Deep learning demonstrates effectiveness across a wide range of tasks. However, the dense and over-parameterized nature of these models results in significant resource consumption during deployment. In response to this issue, weight…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-05 Cong Ma , Du Wu , Zhelang Deng , Jiang Chen , Xiaowen Huang , Jintao Meng , Wenxi Zhu , Bingqiang Wang , Amelie Chi Zhou , Peng Chen , Minwen Deng , Yanjie Wei , Shengzhong Feng , Yi Pan

Deep Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in a wide range of applications. However, deeper CNN models, which are usually computation consuming, are widely required for complex Artificial…

Systems and Control · Electrical Eng. & Systems 2020-01-08 Chaoyang Zhu , Kejie Huang , Shuyuan Yang , Ziqi Zhu , Hejia Zhang , Haibin Shen

Deep Neural Network (DNN) based inference at the edge is challenging as these compute and data-intensive algorithms need to be implemented at low cost and low power while meeting the latency constraints of the target applications. Sparsity,…

Neural and Evolutionary Computing · Computer Science 2023-06-13 Adithya Krishna , Srikanth Rohit Nudurupati , Chandana D G , Pritesh Dwivedi , André van Schaik , Mahesh Mehendale , Chetan Singh Thakur

General sparse matrix-matrix multiplication (SpGEMM) is an integral part of many scientific computing, high-performance computing (HPC), and graph analytic applications. This paper presents a new compressed sparse vector (CSV) format for…

Performance · Computer Science 2021-12-21 Erfan Bank Tavakoli , Michael Riera , Masudul Hassan Quraishi , Fengbo Ren

Accelerators for sparse matrix multiplication are important components in emerging systems. In this paper, we study the main challenges of accelerating Sparse Matrix Multiplication (SpMM). For the situations that data is not stored in the…

Hardware Architecture · Computer Science 2019-06-04 Pareesa Ameneh Golnari , Sharad Malik

The rapid advancement of wireless communication technologies, including 5G, emerging 6G networks, and the large-scale deployment of the Internet of Things (IoT), has intensified the need for efficient spectrum utilization. Automatic…

Hardware Architecture · Computer Science 2026-01-07 Kuilian Yang , Li Zhang , Ahmed M. Eltawil , Khaled Nabil Salama

Spiking Neural Networks (SNNs) have gained significant research attention in the last decade due to their potential to drive resource-constrained edge devices. Though existing SNN accelerators offer high efficiency in processing sparse…

Hardware Architecture · Computer Science 2024-09-04 Ruokai Yin , Youngeun Kim , Di Wu , Priyadarshini Panda

Fueled by the ability to mine real-world graph data, GNN applications have experienced phenomenal growth. Sparse Matrix-Matrix Multiplication (SpMM) is a critical operator in GNNs. However, existing SpMM designs for GNNs struggle to adapt…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-18 Lixing Zhang , Guanhua Ye , Hongzheng Li , Shigang Li , Yingxia Shao

This paper introduces the sparse periodic systolic (SPS) dataflow, which advances the state-of-the-art hardware accelerator for supporting lightweight neural networks. Specifically, the SPS dataflow enables a novel hardware design approach…

Computer Vision and Pattern Recognition · Computer Science 2022-07-04 Jung Hwan Heo , Arash Fayyazi , Amirhossein Esmaili , Massoud Pedram

With the increasing data volume, there is a trend of using large-scale pre-trained models to store the knowledge into an enormous number of model parameters. The training of these models is composed of lots of dense algebras, requiring a…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-04-11 Xiaonan Nie , Xupeng Miao , Zilong Wang , Zichao Yang , Jilong Xue , Lingxiao Ma , Gang Cao , Bin Cui

To address the challenge of increasing network size, researchers have developed sparse models through network pruning. However, maintaining model accuracy while achieving significant speedups on general computing devices remains an open…

Artificial Intelligence · Computer Science 2023-10-31 Haitao Xu , Songwei Liu , Yuyang Xu , Shuai Wang , Jiashi Li , Chenqian Yan , Liangqiang Li , Lean Fu , Xin Pan , Fangmin Chen

General-purpose Sparse Matrix-Matrix Multiplication (SpMM) is a fundamental kernel in scientific computing and deep learning. The emergence of new matrix computation units such as Tensor Cores (TCs) brings more opportunities for SpMM…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-17 Haisha Zhao , San Li , Jiaheng Wang , Chunbao Zhou , Jue Wang , Zhikuang Xin , Shunde Li , Zhiqiang Liang , Zhijie Pan , Fang Liu , Yan Zeng , Yangang Wang , Xuebin Chi
‹ Prev 1 2 3 10 Next ›