Related papers: Accelerating DNN Training with Structured Data Gra…

StructADMM: A Systematic, High-Efficiency Framework of Structured Weight Pruning for DNNs

Weight pruning methods of DNNs have been demonstrated to achieve a good model pruning rate without loss of accuracy, thereby alleviating the significant computation/storage requirements of large-scale DNNs. Structured weight pruning methods…

Neural and Evolutionary Computing · Computer Science 2019-03-28 Tianyun Zhang, Shaokai Ye, Kaiqi Zhang, Xiaolong Ma, Ning Liu, Linfeng Zhang, Jian Tang, Kaisheng Ma, Xue Lin, Makan Fardad, Yanzhi Wang

Speedup deep learning models on GPU by taking advantage of efficient unstructured pruning and bit-width reduction

This work focuses on pruning several convolutional neural networks (CNNs) and improving their efficiency on graphics processing units (GPUs) by using a direct sparse algorithm. The NVIDIA deep neural network (cuDNN) library is the…

Machine Learning · Computer Science 2022-08-30 Marcin Pietroń, Dominik Żurek

When deep learning models on GPU can be accelerated by taking advantage of unstructured sparsity

This paper focuses on improving the efficiency of sparse convolutional neural network (CNN) layers on graphics processing units (GPUs). The NVIDIA deep neural network (cuDNN) library provides the most effective implementation…

Machine Learning · Computer Science 2022-01-03 Marcin Pietroń, Dominik Żurek

Global Sparse Momentum SGD for Pruning Very Deep Neural Networks

Deep Neural Network (DNN) is powerful but computationally expensive and memory intensive, thus impeding its practical usage on resource-constrained front-end devices. DNN pruning is an approach for deep model compression, which aims at…

Machine Learning · Computer Science 2019-10-28 Xiaohan Ding, Guiguang Ding, Xiangxin Zhou, Yuchen Guo, Jungong Han, Ji Liu

Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression

Compressing Deep Neural Network (DNN) models to alleviate the storage and computation requirements is essential for practical applications, especially for resource limited devices. Although capable of reducing a reasonable amount of model…

Machine Learning · Computer Science 2021-06-17 Sheng Lin, Wei Jiang, Wei Wang, Kaidi Xu, Yanzhi Wang, Shan Liu, Songnan Li

Weight Block Sparsity: Training, Compilation, and AI Engine Accelerators

Nowadays, increasingly larger Deep Neural Networks (DNNs) are being developed, trained, and utilized. These networks require significant computational resources, putting a strain on both advanced and limited devices. Our solution is to…

Machine Learning · Computer Science 2024-07-16 Paolo D'Alberto, Taehee Jeong, Akshai Jain, Shreyas Manjunath, Mrinal Sarmah, Samuel Hsu, Yaswanth Raparti, Nitesh Pipralia

Procrustes: a Dataflow and Accelerator for Sparse Deep Neural Network Training

The success of DNN pruning has led to the development of energy-efficient inference accelerators that support pruned models with sparse weight and activation tensors. Because the memory layouts and dataflows in these architectures are…

Neural and Evolutionary Computing · Computer Science 2020-09-24 Dingqing Yang, Amin Ghasemazar, Xiaowei Ren, Maximilian Golub, Guy Lemieux, Mieszko Lis

Full deep neural network training on a pruned weight budget

We introduce a DNN training technique that learns only a fraction of the full parameter set without incurring an accuracy penalty. To do this, our algorithm constrains the total number of weights updated during backpropagation to those with…

Machine Learning · Computer Science 2019-11-26 Maximilian Golub, Guy Lemieux, Mieszko Lis

Accelerating Training of Deep Neural Networks via Sparse Edge Processing

We propose a reconfigurable hardware architecture for deep neural networks (DNNs) capable of online training and inference, which uses algorithmically pre-determined, structured sparsity to significantly lower memory and computational…

Neural and Evolutionary Computing · Computer Science 2017-11-07 Sourya Dey, Yinan Shao, Keith M. Chugg, Peter A. Beerel

SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference

In recent years, there has been a flurry of research in deep neural network pruning and compression. Early approaches prune weights individually. However, it is difficult to take advantage of the resulting unstructured sparsity patterns on…

Machine Learning · Computer Science 2020-08-28 Ziheng Wang

PruneTrain: Fast Neural Network Training by Dynamic Sparse Model Reconfiguration

State-of-the-art convolutional neural networks (CNNs) used in vision applications have large models with numerous weights. Training these models is very compute- and memory-resource intensive. Much research has been done on pruning or…

Machine Learning · Computer Science 2019-12-10 Sangkug Lym, Esha Choukse, Siavash Zangeneh, Wei Wen, Sujay Sanghavi, Mattan Erez

Balanced Sparsity for Efficient DNN Inference on GPU

In trained deep neural networks, unstructured pruning can reduce redundant weights to lower storage cost. However, it requires customized hardware to speed up practical inference. Another trend accelerates sparse model inference…

Computer Vision and Pattern Recognition · Computer Science 2020-10-30 Zhuliang Yao, Shijie Cao, Wencong Xiao, Chen Zhang, Lanshun Nie

Efficient N:M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design

Sparse training is one of the promising techniques to reduce the computational cost of DNNs while retaining high accuracy. In particular, N:M fine-grained structured sparsity, where only N out of consecutive M elements can be nonzero, has…

Machine Learning · Computer Science 2023-09-25 Chao Fang, Wei Sun, Aojun Zhou, Zhongfeng Wang
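
The N:M constraint described in the abstract above (at most N nonzero values in every group of M consecutive elements) can be illustrated with a small sketch. The function names `prune_n_m` and `satisfies_n_m` are hypothetical, and keeping the largest-magnitude weights in each group is an assumed selection rule chosen for illustration; it is not taken from the paper.

```python
def prune_n_m(weights, n=2, m=4):
    """Zero all but the n largest-magnitude values in each group of m.

    Magnitude-based selection is an illustrative assumption, not the
    method of any paper listed here.
    """
    if len(weights) % m != 0:
        raise ValueError("length of weights must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Positions of the n largest-magnitude entries within this group.
        order = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(order[:n])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


def satisfies_n_m(weights, n=2, m=4):
    """Check that every group of m consecutive values has at most n nonzeros."""
    return all(
        sum(1 for w in weights[start:start + m] if w != 0) <= n
        for start in range(0, len(weights), m)
    )


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.0]
sparse_row = prune_n_m(row)  # 2:4 sparsity: two survivors per group of four
print(sparse_row)
print(satisfies_n_m(row), satisfies_n_m(sparse_row))
```

Because every group has the same number of surviving weights, the nonzero positions can be stored with a fixed-size index per group, which is what makes this pattern friendlier to hardware than fully unstructured sparsity.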

Effective Model Sparsification by Scheduled Grow-and-Prune Methods

Deep neural networks (DNNs) are effective in solving many real-world problems. Larger DNN models usually exhibit better quality (e.g., accuracy) but their excessive computation results in long inference time. Model sparsification can reduce…

Computer Vision and Pattern Recognition · Computer Science 2022-03-07 Xiaolong Ma, Minghai Qin, Fei Sun, Zejiang Hou, Kun Yuan, Yi Xu, Yanzhi Wang, Yen-Kuang Chen, Rong Jin, Yuan Xie

Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity

Network pruning can reduce the high computation cost of deep neural network (DNN) models. However, to maintain their accuracies, sparse models often carry randomly-distributed weights, leading to irregular computations. Consequently, sparse…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-01 Cong Guo, Bo Yang Hsueh, Jingwen Leng, Yuxian Qiu, Yue Guan, Zehuan Wang, Xiaoying Jia, Xipeng Li, Minyi Guo, Yuhao Zhu

Towards Sparsification of Graph Neural Networks

As real-world graphs expand in size, larger GNN models with billions of parameters are deployed. High parameter count in such models makes training and inference on graphs expensive and challenging. To reduce the computational and memory…

Machine Learning · Computer Science 2023-02-27 Hongwu Peng, Deniz Gurevin, Shaoyi Huang, Tong Geng, Weiwen Jiang, Omer Khan, Caiwen Ding

SPDY: Accurate Pruning with Speedup Guarantees

The recent focus on the efficiency of deep neural networks (DNNs) has led to significant work on model compression approaches, of which weight pruning is one of the most popular. At the same time, there is rapidly-growing computational…

Machine Learning · Computer Science 2022-08-25 Elias Frantar, Dan Alistarh

Progressive Gradient Pruning for Classification, Detection and Domain Adaptation

Although deep neural networks (NNs) have achieved state-of-the-art accuracy in many visual recognition tasks, the growing computational complexity and energy consumption of networks remains an issue, especially for applications on platforms…

Machine Learning · Computer Science 2020-02-26 Le Thanh Nguyen-Meidine, Eric Granger, Madhu Kiran, Louis-Antoine Blais-Morin, Marco Pedersoli

Designing Semi-Structured Pruning of Graph Convolutional Networks for Skeleton-based Recognition

Deep neural networks (DNNs) are nowadays witnessing a major success in solving many pattern recognition tasks including skeleton-based classification. The deployment of DNNs on edge-devices, endowed with limited time and memory resources,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-17 Hichem Sahbi

Shfl-BW: Accelerating Deep Neural Network Inference with Tensor-Core Aware Weight Pruning

Weight pruning in deep neural networks (DNNs) can reduce storage and computation cost, but struggles to bring practical speedup to the model inference time. Tensor-cores can significantly boost the throughput of GPUs on dense computation,…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-15 Guyue Huang, Haoran Li, Minghai Qin, Fei Sun, Yufei Ding, Yuan Xie