Related papers: MPDCompress - Matrix Permutation Decomposition Alg…

DP-Net: Dynamic Programming Guided Deep Neural Network Compression

In this work, we propose an effective scheme (called DP-Net) for compressing the deep neural networks (DNNs). It includes a novel dynamic programming (DP) based algorithm to obtain the optimal solution of weight quantization and an…

Machine Learning · Computer Science 2020-03-24 Dingcheng Yang, Wenjian Yu, Ao Zhou, Haoyuan Mu, Gary Yao, Xiaoyi Wang

A Unified Approximation Framework for Compressing and Accelerating Deep Neural Networks

Deep neural networks (DNNs) have achieved significant success in a variety of real world applications, i.e., image classification. However, tons of parameters in the networks restrict the efficiency of neural networks due to the large model…

Machine Learning · Computer Science 2019-08-21 Yuzhe Ma, Ran Chen, Wei Li, Fanhua Shang, Wenjian Yu, Minsik Cho, Bei Yu

DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression

DNNs have been quickly and broadly exploited to improve the data analysis quality in many complex science and engineering applications. Today's DNNs are becoming deeper and wider because of increasing demand on the analysis quality and more…

Computer Vision and Pattern Recognition · Computer Science 2019-04-24 Sian Jin, Sheng Di, Xin Liang, Jiannan Tian, Dingwen Tao, Franck Cappello

Joint Matrix Decomposition for Deep Convolutional Neural Networks Compression

Deep convolutional neural networks (CNNs) with a large number of parameters require intensive computational resources, and thus are hard to be deployed in resource-constrained platforms. Decomposition-based methods, therefore, have been…

Computer Vision and Pattern Recognition · Computer Science 2022-10-27 Shaowu Chen, Jiahao Zhou, Weize Sun, Lei Huang

A Survey on Deep Neural Network Compression: Challenges, Overview, and Solutions

Deep Neural Network (DNN) has gained unprecedented performance due to its automated feature extraction capability. This high order performance leads to significant incorporation of DNN models in different Internet of Things (IoT)…

Machine Learning · Computer Science 2020-10-09 Rahul Mishra, Hari Prabhat Gupta, Tanima Dutta

Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration

Weight pruning is an effective model compression technique to tackle the challenges of achieving real-time deep neural network (DNN) inference on mobile devices. However, prior pruning schemes have limited application scenarios due to…

Machine Learning · Computer Science 2022-03-29 Yifan Gong, Geng Yuan, Zheng Zhan, Wei Niu, Zhengang Li, Pu Zhao, Yuxuan Cai, Sijia Liu, Bin Ren, Xue Lin, Xulong Tang, Yanzhi Wang

Decomposable-Net: Scalable Low-Rank Compression for Neural Networks

Compressing DNNs is important for the real-world applications operating on resource-constrained devices. However, we typically observe drastic performance deterioration when changing model size after training is completed. Therefore,…

Machine Learning · Computer Science 2021-09-30 Atsushi Yaguchi, Taiji Suzuki, Shuhei Nitta, Yukinobu Sakata, Akiyuki Tanizawa

A Survey of Model Compression and Acceleration for Deep Neural Networks

Deep neural networks (DNNs) have recently achieved great success in many visual recognition tasks. However, existing deep neural network models are computationally expensive and memory intensive, hindering their deployment in devices with…

Machine Learning · Computer Science 2020-06-16 Yu Cheng, Duo Wang, Pan Zhou, Tao Zhang

Toward Compact Parameter Representations for Architecture-Agnostic Neural Network Compression

This paper investigates deep neural network (DNN) compression from the perspective of compactly representing and storing trained parameters. We explore the previously overlooked opportunity of cross-layer architecture-agnostic…

Computer Vision and Pattern Recognition · Computer Science 2021-11-22 Yuezhou Sun, Wenlong Zhao, Lijun Zhang, Xiao Liu, Hui Guan, Matei Zaharia

Neural Network Compression via Effective Filter Analysis and Hierarchical Pruning

Network compression is crucial to making the deep networks to be more efficient, faster, and generalizable to low-end hardware. Current network compression methods have two open problems: first, there lacks a theoretical framework to…

Machine Learning · Computer Science 2022-06-09 Ziqi Zhou, Li Lian, Yilong Yin, Ze Wang

Structured Deep Neural Network Pruning via Matrix Pivoting

Deep Neural Networks (DNNs) are the key to the state-of-the-art machine vision, sensor fusion and audio/video signal processing. Unfortunately, their computation complexity and tight resource constraints on the Edge make them hard to…

Machine Learning · Computer Science 2017-12-05 Ranko Sredojevic, Shaoyi Cheng, Lazar Supic, Rawan Naous, Vladimir Stojanovic

Deep Neural Networks Based Weight Approximation and Computation Reuse for 2-D Image Classification

Deep Neural Networks (DNNs) are computationally and memory intensive, which makes their hardware implementation a challenging task especially for resource constrained devices such as IoT nodes. To address this challenge, this paper…

Computer Vision and Pattern Recognition · Computer Science 2021-05-10 Mohammed F. Tolba, Huruy Tekle Tesfai, Hani Saleh, Baker Mohammad, Mahmoud Al-Qutayri

Transform-Based Feature Map Compression for CNN Inference

To achieve higher accuracy in machine learning tasks, very deep convolutional neural networks (CNNs) are designed recently. However, the large memory access of deep CNNs will lead to high power consumption. A variety of hardware-friendly…

Image and Video Processing · Electrical Eng. & Systems 2021-06-25 Yubo Shi, Meiqi Wang, Siyi Chen, Jinghe Wei, Zhongfeng Wang

Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression

Compressing Deep Neural Network (DNN) models to alleviate the storage and computation requirements is essential for practical applications, especially for resource limited devices. Although capable of reducing a reasonable amount of model…

Machine Learning · Computer Science 2021-06-17 Sheng Lin, Wei Jiang, Wei Wang, Kaidi Xu, Yanzhi Wang, Shan Liu, Songnan Li

Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization

Deep Neural Networks (DNNs) have shown significant advantages in a wide variety of domains. However, DNNs are becoming computationally intensive and energy hungry at an exponential pace, while at the same time, there is a vast demand for…

Machine Learning · Computer Science 2023-12-27 Konstantinos Balaskas, Andreas Karatzas, Christos Sad, Kostas Siozios, Iraklis Anagnostopoulos, Georgios Zervakis, Jörg Henkel

Run-Time Efficient RNN Compression for Inference on Edge Devices

Recurrent neural networks can be large and compute-intensive, yet many applications that benefit from RNNs run on small devices with very limited compute and storage capabilities while still having run-time constraints. As a result, there…

Machine Learning · Computer Science 2020-08-14 Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina

"Lossless" Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach

Modern deep neural networks (DNNs) are extremely powerful; however, this comes at the price of increased depth and having more parameters per layer, making their training and inference more computationally challenging. In an attempt to…

Machine Learning · Statistics 2024-03-04 Lingyu Gu, Yongqi Du, Yuan Zhang, Di Xie, Shiliang Pu, Robert C. Qiu, Zhenyu Liao

Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-based Approach

Deep Neural Networks (DNNs) are applied in a wide range of usecases. There is an increased demand for deploying DNNs on devices that do not have abundant resources such as memory and computation units. Recently, network compression through…

Machine Learning · Computer Science 2020-05-19 Haichuan Yang, Shupeng Gui, Yuhao Zhu, Ji Liu

Progressive Weight Pruning of Deep Neural Networks using ADMM

Deep neural networks (DNNs) although achieving human-level performance in many domains, have very large model size that hinders their broader applications on edge computing devices. Extensive research work have been conducted on DNN model…

Machine Learning · Computer Science 2018-11-06 Shaokai Ye, Tianyun Zhang, Kaiqi Zhang, Jiayu Li, Kaidi Xu, Yunfei Yang, Fuxun Yu, Jian Tang, Makan Fardad, Sijia Liu, Xiang Chen, Xue Lin, Yanzhi Wang

Towards Image Understanding from Deep Compression without Decoding

Motivated by recent work on deep neural network (DNN)-based image compression methods showing potential improvements in image quality, savings in storage, and bandwidth reduction, we propose to perform image understanding tasks such as…

Computer Vision and Pattern Recognition · Computer Science 2018-03-19 Robert Torfason, Fabian Mentzer, Eirikur Agustsson, Michael Tschannen, Radu Timofte, Luc Van Gool