Related papers: Network Pruning for Low-Rank Binary Indexing

Neural Network Compression via Effective Filter Analysis and Hierarchical Pruning

Network compression is crucial to making the deep networks to be more efficient, faster, and generalizable to low-end hardware. Current network compression methods have two open problems: first, there lacks a theoretical framework to…

Machine Learning · Computer Science 2022-06-09 Ziqi Zhou , Li Lian , Yilong Yin , Ze Wang

A "Network Pruning Network" Approach to Deep Model Compression

We present a filter pruning approach for deep model compression, using a multitask network. Our approach is based on learning a a pruner network to prune a pre-trained target network. The pruner is essentially a multitask deep neural…

Computer Vision and Pattern Recognition · Computer Science 2020-01-17 Vinay Kumar Verma , Pravendra Singh , Vinay P. Namboodiri , Piyush Rai

To prune, or not to prune: exploring the efficacy of pruning for model compression

Model pruning seeks to induce sparsity in a deep neural network's various connection matrices, thereby reducing the number of nonzero-valued parameters in the model. Recent reports (Han et al., 2015; Narang et al., 2017) prune deep networks…

Machine Learning · Statistics 2017-11-15 Michael Zhu , Suyog Gupta

ThinResNet: A New Baseline for Structured Convolutional Networks Pruning

Pruning is a compression method which aims to improve the efficiency of neural networks by reducing their number of parameters while maintaining a good performance, thus enhancing the performance-to-cost ratio in nontrivial ways. Of…

Neural and Evolutionary Computing · Computer Science 2023-09-25 Hugo Tessier , Ghouti Boukli Hacene , Vincent Gripon

Effective Network Compression Using Simulation-Guided Iterative Pruning

Existing high-performance deep learning models require very intensive computing. For this reason, it is difficult to embed a deep learning model into a system with limited resources. In this paper, we propose the novel idea of the network…

Machine Learning · Computer Science 2019-02-13 Dae-Woong Jeong , Jaehun Kim , Youngseok Kim , Tae-Ho Kim , Myungsu Chae

Low-Rank Prune-And-Factorize for Language Model Compression

The components underpinning PLMs -- large weight matrices -- were shown to bear considerable redundancy. Matrix factorization, a well-established technique from matrix theory, has been utilized to reduce the number of parameters in PLM.…

Computation and Language · Computer Science 2023-06-27 Siyu Ren , Kenny Q. Zhu

Paying more attention to snapshots of Iterative Pruning: Improving Model Compression via Ensemble Distillation

Network pruning is one of the most dominant methods for reducing the heavy inference cost of deep neural networks. Existing methods often iteratively prune networks to attain high compression ratio without incurring significant loss in…

Computer Vision and Pattern Recognition · Computer Science 2020-08-17 Duong H. Le , Trung-Nhan Vo , Nam Thoai

C2S2: Cost-aware Channel Sparse Selection for Progressive Network Pruning

This paper describes a channel-selection approach for simplifying deep neural networks. Specifically, we propose a new type of generic network layer, called pruning layer, to seamlessly augment a given pre-trained model for compression.…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Chih-Yao Chiu , Hwann-Tzong Chen , Tyng-Luh Liu

Data-Efficient Structured Pruning via Submodular Optimization

Structured pruning is an effective approach for compressing large pre-trained neural networks without significantly affecting their performance. However, most current structured pruning methods do not provide any performance guarantees, and…

Machine Learning · Computer Science 2023-02-14 Marwa El Halabi , Suraj Srinivas , Simon Lacoste-Julien

Compact Neural Representation Using Attentive Network Pruning

Deep neural networks have evolved to become power demanding and consequently difficult to apply to small-size mobile platforms. Network parameter reduction methods have been introduced to systematically deal with the computational and…

Computer Vision and Pattern Recognition · Computer Science 2020-05-12 Mahdi Biparva , John Tsotsos

MINT: Deep Network Compression via Mutual Information-based Neuron Trimming

Most approaches to deep neural network compression via pruning either evaluate a filter's importance using its weights or optimize an alternative objective function with sparsity constraints. While these methods offer a useful way to…

Machine Learning · Computer Science 2020-03-20 Madan Ravi Ganesh , Jason J. Corso , Salimeh Yasaei Sekeh

Structured Deep Neural Network Pruning via Matrix Pivoting

Deep Neural Networks (DNNs) are the key to the state-of-the-art machine vision, sensor fusion and audio/video signal processing. Unfortunately, their computation complexity and tight resource constraints on the Edge make them hard to…

Machine Learning · Computer Science 2017-12-05 Ranko Sredojevic , Shaoyi Cheng , Lazar Supic , Rawan Naous , Vladimir Stojanovic

Structured Pruning of Deep Convolutional Neural Networks

Real time application of deep learning algorithms is often hindered by high computational complexity and frequent memory accesses. Network pruning is a promising technique to solve this problem. However, pruning usually results in irregular…

Neural and Evolutionary Computing · Computer Science 2015-12-31 Sajid Anwar , Kyuyeon Hwang , Wonyong Sung

Neural Network Compression by Joint Sparsity Promotion and Redundancy Reduction

Compression of convolutional neural network models has recently been dominated by pruning approaches. A class of previous works focuses solely on pruning the unimportant filters to achieve network compression. Another important direction is…

Computer Vision and Pattern Recognition · Computer Science 2022-10-17 Tariq M. Khan , Syed S. Naqvi , Antonio Robles-Kelly , Erik Meijering

PCAS: Pruning Channels with Attention Statistics for Deep Network Compression

Compression techniques for deep neural networks are important for implementing them on small embedded devices. In particular, channel-pruning is a useful technique for realizing compact networks. However, many conventional methods require…

Machine Learning · Statistics 2021-11-03 Kohei Yamamoto , Kurato Maeno

A Closer Look at Structured Pruning for Neural Network Compression

Structured pruning is a popular method for compressing a neural network: given a large trained network, one alternates between removing channel connections and fine-tuning; reducing the overall width of the network. However, the efficacy of…

Machine Learning · Statistics 2019-06-10 Elliot J. Crowley , Jack Turner , Amos Storkey , Michael O'Boyle

A Main/Subsidiary Network Framework for Simplifying Binary Neural Network

To reduce memory footprint and run-time latency, techniques such as neural network pruning and binarization have been explored separately. However, it is unclear how to combine the best of the two worlds to get extremely small and efficient…

Computer Vision and Pattern Recognition · Computer Science 2018-12-12 Yinghao Xu , Xin Dong , Yudian Li , Hao Su

Elimination-compensation pruning for fully-connected neural networks

The unmatched ability of Deep Neural Networks in capturing complex patterns in large and noisy datasets is often associated with their large hypothesis space, and consequently to the vast amount of parameters that characterize model…

Machine Learning · Computer Science 2026-02-25 Enrico Ballini , Luca Muscarnera , Alessio Fumagalli , Anna Scotti , Francesco Regazzoni

Pruning and Quantization for Deep Neural Network Acceleration: A Survey

Deep neural networks have been applied in many applications exhibiting extraordinary abilities in the field of computer vision. However, complex network architectures challenge efficient real-time deployment and require significant…

Computer Vision and Pattern Recognition · Computer Science 2021-06-16 Tailin Liang , John Glossner , Lei Wang , Shaobo Shi , Xiaotong Zhang

Multi-loss-aware Channel Pruning of Deep Networks

Channel pruning, which seeks to reduce the model size by removing redundant channels, is a popular solution for deep networks compression. Existing channel pruning methods usually conduct layer-wise channel selection by directly minimizing…

Computer Vision and Pattern Recognition · Computer Science 2019-05-14 Yiming Hu , Siyang Sun , Jianquan Li , Jiagang Zhu , Xingang Wang , Qingyi Gu