Related papers: BLK-REW: A Unified Block-based DNN Pruning Framewo…

A Unified DNN Weight Compression Framework Using Reweighted Optimization Methods

To address the large model size and intensive computation requirement of deep neural networks (DNNs), weight pruning techniques have been proposed and generally fall into two categories, i.e., static regularization-based pruning and dynamic…

Machine Learning · Computer Science 2020-04-14 Tianyun Zhang, Xiaolong Ma, Zheng Zhan, Shanglin Zhou, Minghai Qin, Fei Sun, Yen-Kuang Chen, Caiwen Ding, Makan Fardad, Yanzhi Wang

CSB-RNN: A Faster-than-Realtime RNN Acceleration Framework with Compressed Structured Blocks

Recurrent neural networks (RNNs) have been widely adopted in temporal sequence analysis, where realtime performance is often in demand. However, RNNs suffer from heavy computational workload as the model often comes with large weight…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-13 Runbin Shi, Peiyan Dong, Tong Geng, Yuhao Ding, Xiaolong Ma, Hayden K.-H. So, Martin Herbordt, Ang Li, Yanzhi Wang

StructADMM: A Systematic, High-Efficiency Framework of Structured Weight Pruning for DNNs

Weight pruning methods of DNNs have been demonstrated to achieve a good model pruning rate without loss of accuracy, thereby alleviating the significant computation/storage requirements of large-scale DNNs. Structured weight pruning methods…

Neural and Evolutionary Computing · Computer Science 2019-03-28 Tianyun Zhang, Shaokai Ye, Kaiqi Zhang, Xiaolong Ma, Ning Liu, Linfeng Zhang, Jian Tang, Kaisheng Ma, Xue Lin, Makan Fardad, Yanzhi Wang

Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression

Compressing Deep Neural Network (DNN) models to alleviate the storage and computation requirements is essential for practical applications, especially for resource limited devices. Although capable of reducing a reasonable amount of model…

Machine Learning · Computer Science 2021-06-17 Sheng Lin, Wei Jiang, Wei Wang, Kaidi Xu, Yanzhi Wang, Shan Liu, Songnan Li

A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration Framework

Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices. However, previous pruning methods mainly focus on reducing the model size and/or improving…

Machine Learning · Computer Science 2022-03-29 Yifan Gong, Zheng Zhan, Zhengang Li, Wei Niu, Xiaolong Ma, Wenhao Wang, Bin Ren, Caiwen Ding, Xue Lin, Xiaolin Xu, Yanzhi Wang

A Unified Framework of DNN Weight Pruning and Weight Clustering/Quantization Using ADMM

Many model compression techniques of Deep Neural Networks (DNNs) have been investigated, including weight pruning, weight clustering and quantization, etc. Weight pruning leverages the redundancy in the number of weights in DNNs, while…

Neural and Evolutionary Computing · Computer Science 2018-11-06 Shaokai Ye, Tianyun Zhang, Kaiqi Zhang, Jiayu Li, Jiaming Xie, Yun Liang, Sijia Liu, Xue Lin, Yanzhi Wang

Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration

Weight pruning is an effective model compression technique to tackle the challenges of achieving real-time deep neural network (DNN) inference on mobile devices. However, prior pruning schemes have limited application scenarios due to…

Machine Learning · Computer Science 2022-03-29 Yifan Gong, Geng Yuan, Zheng Zhan, Wei Niu, Zhengang Li, Pu Zhao, Yuxuan Cai, Sijia Liu, Bin Ren, Xue Lin, Xulong Tang, Yanzhi Wang

ADMM-NN: An Algorithm-Hardware Co-Design Framework of DNNs Using Alternating Direction Method of Multipliers

To facilitate efficient embedded and hardware implementations of deep neural networks (DNNs), two important categories of DNN model compression techniques, weight pruning and weight quantization, are investigated. The former leverages the…

Machine Learning · Computer Science 2019-01-03 Ao Ren, Tianyun Zhang, Shaokai Ye, Jiayu Li, Wenyao Xu, Xuehai Qian, Xue Lin, Yanzhi Wang

A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers

Weight pruning methods for deep neural networks (DNNs) have been investigated recently, but prior work in this area is mainly heuristic, iterative pruning, thereby lacking guarantees on the weight reduction ratio and convergence time. To…

Neural and Evolutionary Computing · Computer Science 2018-10-23 Tianyun Zhang, Shaokai Ye, Kaiqi Zhang, Jian Tang, Wujie Wen, Makan Fardad, Yanzhi Wang

SS-Auto: A Single-Shot, Automatic Structured Weight Pruning Framework of DNNs with Ultra-High Efficiency

Structured weight pruning is a representative model compression technique of DNNs for hardware efficiency and inference accelerations. Previous works in this area leave great space for improvement since sparse structures with combinations…

Machine Learning · Computer Science 2020-02-11 Zhengang Li, Yifan Gong, Xiaolong Ma, Sijia Liu, Mengshu Sun, Zheng Zhan, Zhenglun Kong, Geng Yuan, Yanzhi Wang

ResNet Can Be Pruned 60x: Introducing Network Purification and Unused Path Removal (P-RM) after Weight Pruning

The state-of-the-art DNN structures involve high computation and great demand for memory storage, which pose intensive challenges on DNN framework resources. To mitigate the challenges, weight pruning techniques have been studied. However, high…

Machine Learning · Computer Science 2019-05-03 Xiaolong Ma, Geng Yuan, Sheng Lin, Zhengang Li, Hao Sun, Yanzhi Wang

Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM

Weight pruning and weight quantization are two important categories of DNN model compression. Prior work on these techniques is mainly based on heuristics. A recent work developed a systematic framework of DNN weight pruning using the…

Neural and Evolutionary Computing · Computer Science 2019-04-02 Shaokai Ye, Xiaoyu Feng, Tianyun Zhang, Xiaolong Ma, Sheng Lin, Zhengang Li, Kaidi Xu, Wujie Wen, Sijia Liu, Jian Tang, Makan Fardad, Xue Lin, Yongpan Liu, Yanzhi Wang

Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization

Deep Neural Networks (DNNs) have shown significant advantages in a wide variety of domains. However, DNNs are becoming computationally intensive and energy hungry at an exponential pace, while at the same time, there is a vast demand for…

Machine Learning · Computer Science 2023-12-27 Konstantinos Balaskas, Andreas Karatzas, Christos Sad, Kostas Siozios, Iraklis Anagnostopoulos, Georgios Zervakis, Jörg Henkel

Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework

Most neural network pruning methods, such as filter-level and layer-level prunings, prune the network model along one dimension (depth, width, or resolution) solely to meet a computational budget. However, such a pruning policy often leads…

Computer Vision and Pattern Recognition · Computer Science 2021-06-16 Wenxiao Wang, Minghao Chen, Shuai Zhao, Long Chen, Jinming Hu, Haifeng Liu, Deng Cai, Xiaofei He, Wei Liu

Three Dimensional Convolutional Neural Network Pruning with Regularization-Based Method

Despite enjoying extensive applications in video analysis, three-dimensional convolutional neural networks (3D CNNs) are restricted by their massive computation and storage consumption. To solve this problem, we propose a three-dimensional…

Machine Learning · Computer Science 2019-05-21 Yuxin Zhang, Huan Wang, Yang Luo, Lu Yu, Haoji Hu, Hangguan Shan, Tony Q. S. Quek

Tiny but Accurate: A Pruned, Quantized and Optimized Memristor Crossbar Framework for Ultra Efficient DNN Implementation

The state-of-the-art DNN structures involve intensive computation and high memory storage. To mitigate the challenges, the memristor crossbar array has emerged as an intrinsically suitable matrix computation and low-power acceleration framework…

Signal Processing · Electrical Eng. & Systems 2019-09-04 Xiaolong Ma, Geng Yuan, Sheng Lin, Caiwen Ding, Fuxun Yu, Tao Liu, Wujie Wen, Xiang Chen, Yanzhi Wang

An Ultra-Efficient Memristor-Based DNN Framework with Structured Weight Pruning and Quantization Using ADMM

The high computation and memory storage of large deep neural networks (DNNs) models pose intensive challenges to the conventional Von-Neumann architecture, incurring substantial data movements in the memory hierarchy. The memristor crossbar…

Emerging Technologies · Computer Science 2019-09-02 Geng Yuan, Xiaolong Ma, Caiwen Ding, Sheng Lin, Tianyun Zhang, Zeinab S. Jalali, Yilong Zhao, Li Jiang, Sucheta Soundarajan, Yanzhi Wang

FeTa: A DCA Pruning Algorithm with Generalization Error Guarantees

Recent DNN pruning algorithms have succeeded in reducing the number of parameters in fully connected layers, often with little or no drop in classification accuracy. However, most of the existing pruning schemes either have to be applied…

Machine Learning · Computer Science 2018-03-13 Konstantinos Pitas, Mike Davies, Pierre Vandergheynst

DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks

The rapidly growing parameter volume of deep neural networks (DNNs) hinders the artificial intelligence applications on resource constrained devices, such as mobile and wearable devices. Neural network pruning, as one of the mainstream…

Machine Learning · Computer Science 2019-11-21 Ao Ren, Tao Zhang, Yuhao Wang, Sheng Lin, Peiyan Dong, Yen-kuang Chen, Yuan Xie, Yanzhi Wang

Filter Bank Regularization of Convolutional Neural Networks

Regularization techniques are widely used to improve the generality, robustness, and efficiency of deep convolutional neural networks (DCNNs). In this paper, we propose a novel approach of regulating DCNN convolutional kernels by a…

Machine Learning · Computer Science 2019-11-28 Seyed Mehdi Ayyoubzadeh, Xiaolin Wu