Related papers: Accelerating Sparse DNNs Based on Tiled GEMM

200 papers

Network pruning can reduce the high computation cost of deep neural network (DNN) models. However, to maintain their accuracies, sparse models often carry randomly-distributed weights, leading to irregular computations. Consequently, sparse…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-01 Cong Guo, Bo Yang Hsueh, Jingwen Leng, Yuxian Qiu, Yue Guan, Zehuan Wang, Xiaoying Jia, Xipeng Li, Minyi Guo, Yuhao Zhu

Exploiting sparsity in deep neural networks (DNNs) has been a promising area for meeting the growing computation requirements. To minimize the overhead of sparse acceleration, hardware designers have proposed structured sparsity support,…

Machine Learning · Computer Science 2025-05-27 Geonhwa Jeong, Po-An Tsai, Abhimanyu R. Bambhaniya, Stephen W. Keckler, Tushar Krishna

Weight pruning in deep neural networks (DNNs) can reduce storage and computation cost, but struggles to bring practical speedup to the model inference time. Tensor-cores can significantly boost the throughput of GPUs on dense computation,…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-15 Guyue Huang, Haoran Li, Minghai Qin, Fei Sun, Yufei Ding, Yuan Xie

The demand for efficient processing of deep neural networks (DNNs) on embedded devices is a significant challenge limiting their deployment. Exploiting sparsity in the network's feature maps is one of the ways to reduce its inference…

Computer Vision and Pattern Recognition · Computer Science 2023-09-28 Matteo Grimaldi, Darshan C. Ganji, Ivan Lazarevich, Sudhakar Sah

Sparsity helps reduce the computational complexity of deep neural networks by skipping zeros. Taking advantage of sparsity is listed as a high priority in next generation DNN accelerators such as TPU. The structure of sparsity, i.e., the…

Machine Learning · Computer Science 2017-06-06 Huizi Mao, Song Han, Jeff Pool, Wenshuo Li, Xingyu Liu, Yu Wang, William J. Dally

As neural network model sizes have dramatically increased, so has the interest in various techniques to reduce their parameter counts and accelerate their execution. An active area of research in this field is sparsity - encouraging zero…

The success of DNN pruning has led to the development of energy-efficient inference accelerators that support pruned models with sparse weight and activation tensors. Because the memory layouts and dataflows in these architectures are…

Neural and Evolutionary Computing · Computer Science 2020-09-24 Dingqing Yang, Amin Ghasemazar, Xiaowei Ren, Maximilian Golub, Guy Lemieux, Mieszko Lis

Deep neural networks (DNNs) have been proven to be effective in solving many real-life problems, but their high computation cost prohibits those models from being deployed to edge devices. Pruning, as a method to introduce zeros to model…

Machine Learning · Computer Science 2021-12-22 Fei Sun, Minghai Qin, Tianyun Zhang, Xiaolong Ma, Haoran Li, Junwen Luo, Zihao Zhao, Yen-Kuang Chen, Yuan Xie

Training deep neural networks (DNNs) is costly. Fortunately, Nvidia Ampere and Hopper GPUs can accelerate matrix multiplications twice as fast as a dense equivalent by implementing 2:4 sparsity. However, previous STE-based 2:4 pre-training…

Machine Learning · Computer Science 2024-12-30 Yuezhou Hu, Jun Zhu, Jianfei Chen

In trained deep neural networks, unstructured pruning can reduce redundant weights to lower storage cost. However, it requires customized hardware to speed up practical inference. Another trend accelerates sparse model inference…

Computer Vision and Pattern Recognition · Computer Science 2020-10-30 Zhuliang Yao, Shijie Cao, Wencong Xiao, Chen Zhang, Lanshun Nie

We propose a reconfigurable hardware architecture for deep neural networks (DNNs) capable of online training and inference, which uses algorithmically pre-determined, structured sparsity to significantly lower memory and computational…

Neural and Evolutionary Computing · Computer Science 2017-11-07 Sourya Dey, Yinan Shao, Keith M. Chugg, Peter A. Beerel

Weight pruning methods of DNNs have been demonstrated to achieve a good model pruning rate without loss of accuracy, thereby alleviating the significant computation/storage requirements of large-scale DNNs. Structured weight pruning methods…

Neural and Evolutionary Computing · Computer Science 2019-03-28 Tianyun Zhang, Shaokai Ye, Kaiqi Zhang, Xiaolong Ma, Ning Liu, Linfeng Zhang, Jian Tang, Kaisheng Ma, Xue Lin, Makan Fardad, Yanzhi Wang

As the size of Deep Neural Networks (DNNs) increases dramatically to achieve high accuracy, the DNNs require a large amount of computation and a large memory footprint. Pruning, which produces a sparse neural network, is one of the solutions to…

Hardware Architecture · Computer Science 2026-04-30 Hyunsung Yoon, Sungju Ryu, Jae-Joon Kim

Phenomenally successful in practical inference problems, convolutional neural networks (CNN) are widely deployed in mobile devices, data centers, and even supercomputers. The number of parameters needed in CNNs, however, is often large and…

Computer Vision and Pattern Recognition · Computer Science 2017-08-01 Jongsoo Park, Sheng Li, Wei Wen, Ping Tak Peter Tang, Hai Li, Yiran Chen, Pradeep Dubey

The rise of Deep Neural Networks (DNNs) has led to an increase in model size and complexity, straining the memory capacity of GPUs. Sparsity in DNNs, characterized as structural or ephemeral, has gained attention as a solution. This work…

Machine Learning · Computer Science 2023-11-30 Daniel Barley, Holger Fröning

Weight pruning is a technique to make Deep Neural Network (DNN) inference more computationally efficient by reducing the number of model parameters over the course of training. However, most weight pruning techniques generally do not…

Machine Learning · Computer Science 2022-02-03 Bradley McDanel, Helia Dinh, John Magallanes

In recent years, there has been a flurry of research in deep neural network pruning and compression. Early approaches prune weights individually. However, it is difficult to take advantage of the resulting unstructured sparsity patterns on…

Machine Learning · Computer Science 2020-08-28 Ziheng Wang

State-of-the-art deep neural networks (DNNs) have significant computational and data management requirements. The size of both training data and models continues to increase. Sparsification and pruning methods are shown to be effective…

Machine Learning · Computer Science 2021-04-27 Gunduz Vehbi Demirci, Hakan Ferhatosmanoglu

Deep neural networks (DNNs) are effective in solving many real-world problems. Larger DNN models usually exhibit better quality (e.g., accuracy) but their excessive computation results in long inference time. Model sparsification can reduce…

Computer Vision and Pattern Recognition · Computer Science 2022-03-07 Xiaolong Ma, Minghai Qin, Fei Sun, Zejiang Hou, Kun Yuan, Yi Xu, Yanzhi Wang, Yen-Kuang Chen, Rong Jin, Yuan Xie

Sparse training is one of the promising techniques to reduce the computational cost of DNNs while retaining high accuracy. In particular, N:M fine-grained structured sparsity, where only N out of consecutive M elements can be nonzero, has…

Machine Learning · Computer Science 2023-09-25 Chao Fang, Wei Sun, Aojun Zhou, Zhongfeng Wang