Related papers: Automatic Neural Network Compression by Sparsity-Q…

Weight Pruning via Adaptive Sparsity Loss

Pruning neural networks has regained interest in recent years as a means to compress state-of-the-art deep neural networks and enable their deployment on resource-constrained devices. In this paper, we propose a robust compressive learning…

Machine Learning · Computer Science 2020-06-05 George Retsinas, Athena Elafrou, Georgios Goumas, Petros Maragos

Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization

Deep Neural Networks (DNNs) have shown significant advantages in a wide variety of domains. However, DNNs are becoming computationally intensive and energy hungry at an exponential pace, while at the same time, there is a vast demand for…

Machine Learning · Computer Science 2023-12-27 Konstantinos Balaskas, Andreas Karatzas, Christos Sad, Kostas Siozios, Iraklis Anagnostopoulos, Georgios Zervakis, Jörg Henkel

A Unified Approximation Framework for Compressing and Accelerating Deep Neural Networks

Deep neural networks (DNNs) have achieved significant success in a variety of real world applications, e.g., image classification. However, tons of parameters in the networks restrict the efficiency of neural networks due to the large model…

Machine Learning · Computer Science 2019-08-21 Yuzhe Ma, Ran Chen, Wei Li, Fanhua Shang, Wenjian Yu, Minsik Cho, Bei Yu

A Unified DNN Weight Compression Framework Using Reweighted Optimization Methods

To address the large model size and intensive computation requirement of deep neural networks (DNNs), weight pruning techniques have been proposed and generally fall into two categories, i.e., static regularization-based pruning and dynamic…

Machine Learning · Computer Science 2020-04-14 Tianyun Zhang, Xiaolong Ma, Zheng Zhan, Shanglin Zhou, Minghai Qin, Fei Sun, Yen-Kuang Chen, Caiwen Ding, Makan Fardad, Yanzhi Wang

A Unified Framework of DNN Weight Pruning and Weight Clustering/Quantization Using ADMM

Many model compression techniques of Deep Neural Networks (DNNs) have been investigated, including weight pruning, weight clustering and quantization, etc. Weight pruning leverages the redundancy in the number of weights in DNNs, while…

Neural and Evolutionary Computing · Computer Science 2018-11-06 Shaokai Ye, Tianyun Zhang, Kaiqi Zhang, Jiayu Li, Jiaming Xie, Yun Liang, Sijia Liu, Xue Lin, Yanzhi Wang

Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM

Weight pruning and weight quantization are two important categories of DNN model compression. Prior work on these techniques is mainly based on heuristics. A recent work developed a systematic framework of DNN weight pruning using the…

Neural and Evolutionary Computing · Computer Science 2019-04-02 Shaokai Ye, Xiaoyu Feng, Tianyun Zhang, Xiaolong Ma, Sheng Lin, Zhengang Li, Kaidi Xu, Wujie Wen, Sijia Liu, Jian Tang, Makan Fardad, Xue Lin, Yongpan Liu, Yanzhi Wang

Towards Explaining Deep Neural Network Compression Through a Probabilistic Latent Space

Despite the impressive performance of deep neural networks (DNNs), their computational complexity and storage space consumption have led to the concept of network compression. While DNN compression techniques such as pruning and low-rank…

Machine Learning · Computer Science 2025-07-04 Mahsa Mozafari-Nia, Salimeh Yasaei Sekeh

Toward Compact Parameter Representations for Architecture-Agnostic Neural Network Compression

This paper investigates deep neural network (DNN) compression from the perspective of compactly representing and storing trained parameters. We explore the previously overlooked opportunity of cross-layer architecture-agnostic…

Computer Vision and Pattern Recognition · Computer Science 2021-11-22 Yuezhou Sun, Wenlong Zhao, Lijun Zhang, Xiao Liu, Hui Guan, Matei Zaharia

ResNet Can Be Pruned 60x: Introducing Network Purification and Unused Path Removal (P-RM) after Weight Pruning

The state-of-the-art DNN structures involve high computation and great demand for memory storage, which pose intensive challenges on DNN framework resources. To mitigate the challenges, weight pruning techniques have been studied. However, high…

Machine Learning · Computer Science 2019-05-03 Xiaolong Ma, Geng Yuan, Sheng Lin, Zhengang Li, Hao Sun, Yanzhi Wang

A Framework For Pruning Deep Neural Networks Using Energy-Based Models

A typical deep neural network (DNN) has a large number of trainable parameters. Choosing a network with proper capacity is challenging and generally a larger network with excessive capacity is trained. Pruning is an established approach to…

Neural and Evolutionary Computing · Computer Science 2021-03-01 Hojjat Salehinejad, Shahrokh Valaee

WeightMom: Learning Sparse Networks using Iterative Momentum-based pruning

Deep Neural Networks have been used in a wide variety of applications with significant success. However, their highly complex nature owing to comprising millions of parameters has led to problems during deployment in pipelines with low…

Machine Learning · Computer Science 2022-08-15 Elvis Johnson, Xiaochen Tang, Sriramacharyulu Samudrala

Universal Deep Neural Network Compression

In this paper, we investigate lossy compression of deep neural networks (DNNs) by weight quantization and lossless source coding for memory-efficient deployment. Whereas the previous work addressed non-universal scalar quantization and…

Computer Vision and Pattern Recognition · Computer Science 2019-02-22 Yoojin Choi, Mostafa El-Khamy, Jungwon Lee

Compression of Deep Convolutional Neural Networks under Joint Sparsity Constraints

We consider the optimization of deep convolutional neural networks (CNNs) such that they provide good performance while having reduced complexity if deployed on either conventional systems utilizing spatial-domain convolution or lower…

Computer Vision and Pattern Recognition · Computer Science 2018-10-30 Yoojin Choi, Mostafa El-Khamy, Jungwon Lee

Joint Pruning and Channel-wise Mixed-Precision Quantization for Efficient Deep Neural Networks

The resource requirements of deep neural networks (DNNs) pose significant challenges to their deployment on edge devices. Common approaches to address this issue are pruning and mixed-precision quantization, which lead to latency and memory…

Machine Learning · Computer Science 2024-09-25 Beatrice Alessandra Motetti, Matteo Risso, Alessio Burrello, Enrico Macii, Massimo Poncino, Daniele Jahier Pagliari

Network Automatic Pruning: Start NAP and Take a Nap

Network pruning can significantly reduce the computation and memory footprint of large neural networks. To achieve a good trade-off between model size and performance, popular pruning techniques usually rely on hand-crafted heuristics and…

Computer Vision and Pattern Recognition · Computer Science 2021-01-19 Wenyuan Zeng, Yuwen Xiong, Raquel Urtasun

Quantisation and Pruning for Neural Network Compression and Regularisation

Deep neural networks are typically too computationally expensive to run in real-time on consumer-grade hardware and low-powered devices. In this paper, we investigate reducing the computational and memory requirements of neural networks…

Machine Learning · Computer Science 2020-01-15 Kimessha Paupamah, Steven James, Richard Klein

A Programmable Approach to Neural Network Compression

Deep neural networks (DNNs) frequently contain far more weights, represented at a higher precision, than are required for the specific task which they are trained to perform. Consequently, they can often be compressed using techniques such…

Machine Learning · Computer Science 2020-12-03 Vinu Joseph, Saurav Muralidharan, Animesh Garg, Michael Garland, Ganesh Gopalakrishnan

Integrating Pruning with Quantization for Efficient Deep Neural Networks Compression

Deep Neural Networks (DNNs) have achieved significant advances in a wide range of applications. However, their deployment on resource-constrained devices remains a challenge due to the large number of layers and parameters, which result in…

Neural and Evolutionary Computing · Computer Science 2025-09-05 Sara Makenali, Babak Rokh, Ali Azarpeyvand

Efficient Joint Optimization of Layer-Adaptive Weight Pruning in Deep Neural Networks

In this paper, we propose a novel layer-adaptive weight-pruning approach for Deep Neural Networks (DNNs) that addresses the challenge of optimizing the output distortion minimization while adhering to a target pruning ratio constraint. Our…

Computer Vision and Pattern Recognition · Computer Science 2023-08-25 Kaixin Xu, Zhe Wang, Xue Geng, Jie Lin, Min Wu, Xiaoli Li, Weisi Lin

Sparsity Meets Robustness: Channel Pruning for the Feynman-Kac Formalism Principled Robust Deep Neural Nets

Deep neural nets (DNNs) compression is crucial for adaptation to mobile devices. Though many successful algorithms exist to compress naturally trained DNNs, developing efficient and stable compression algorithms for robustly trained DNNs…

Machine Learning · Computer Science 2020-03-03 Thu Dinh, Bao Wang, Andrea L. Bertozzi, Stanley J. Osher