Related papers: Pruning Compact ConvNets for Efficient Inference

Neural Network Pruning Through Constrained Reinforcement Learning

Network pruning reduces the size of neural networks by removing (pruning) neurons such that the performance drop is minimal. Traditional pruning approaches focus on designing metrics to quantify the usefulness of a neuron which is often…

Computer Vision and Pattern Recognition · Computer Science 2021-11-01 Shehryar Malik, Muhammad Umair Haider, Omer Iqbal, Murtaza Taj

Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy

Neural network pruning is a popular technique used to reduce the inference costs of modern, potentially overparameterized, networks. Starting from a pre-trained network, the process is as follows: remove redundant parameters, retrain, and…

Machine Learning · Computer Science 2021-03-05 Lucas Liebenwein, Cenk Baykal, Brandon Carter, David Gifford, Daniela Rus

An Experimental Study of the Impact of Pre-training on the Pruning of a Convolutional Neural Network

In recent years, deep neural networks have enjoyed wide success in various application domains. However, they require substantial computational and memory resources, which severely hinders their deployment, notably on mobile devices or for…

Computer Vision and Pattern Recognition · Computer Science 2021-12-16 Nathan Hubens, Matei Mancas, Bernard Gosselin, Marius Preda, Titus Zaharia

Ps and Qs: Quantization-aware pruning for efficient low latency neural network inference

Efficient machine learning implementations optimized for inference in hardware have wide-ranging benefits, depending on the application, from lower inference latency to higher data throughput and reduced energy consumption. Two popular…

Machine Learning · Computer Science 2021-07-21 Benjamin Hawks, Javier Duarte, Nicholas J. Fraser, Alessandro Pappalardo, Nhan Tran, Yaman Umuroglu

Quantisation and Pruning for Neural Network Compression and Regularisation

Deep neural networks are typically too computationally expensive to run in real-time on consumer-grade hardware and low-powered devices. In this paper, we investigate reducing the computational and memory requirements of neural networks…

Machine Learning · Computer Science 2020-01-15 Kimessha Paupamah, Steven James, Richard Klein

Pruning Convolutional Filters using Batch Bridgeout

State-of-the-art computer vision models are rapidly increasing in capacity, where the number of parameters far exceeds the number required to fit the training set. This results in better optimization and generalization performance. However,…

Machine Learning · Computer Science 2020-09-24 Najeeb Khan, Ian Stavness

Pruning and Quantization Impact on Graph Neural Networks

Graph neural networks (GNNs) are known to operate with high accuracy on learning from graph-structured data, but they suffer from high computational and resource costs. Neural network compression methods are used to reduce the model size…

Machine Learning · Computer Science 2025-10-28 Khatoon Khedri, Reza Rawassizadeh, Qifu Wen, Mehdi Hosseinzadeh

Efficient Inference of CNNs via Channel Pruning

The deployment of Convolutional Neural Networks (CNNs) on resource-constrained platforms such as mobile devices and embedded systems has been greatly hindered by their high implementation cost, and thus motivated a lot of research interest in…

Computer Vision and Pattern Recognition · Computer Science 2019-08-12 Boyu Zhang, Azadeh Davoodi, Yu Hen Hu

ThinResNet: A New Baseline for Structured Convolutional Networks Pruning

Pruning is a compression method which aims to improve the efficiency of neural networks by reducing their number of parameters while maintaining a good performance, thus enhancing the performance-to-cost ratio in nontrivial ways. Of…

Neural and Evolutionary Computing · Computer Science 2023-09-25 Hugo Tessier, Ghouti Boukli Hacene, Vincent Gripon

Pruning Filters for Efficient ConvNets

The success of CNNs in various applications is accompanied by a significant increase in the computation and parameter storage costs. Recent efforts toward reducing these overheads involve pruning and compressing the weights of various…

Computer Vision and Pattern Recognition · Computer Science 2017-03-13 Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, Hans Peter Graf

Confident magnitude-based neural network pruning

Pruning neural networks has proven to be a successful approach to increase the efficiency and reduce the memory storage of deep learning models without compromising performance. Previous literature has shown that it is possible to achieve a…

Machine Learning · Computer Science 2024-08-12 Joaquin Alvarez

Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective

The rapid development of large-scale deep learning models questions the affordability of hardware platforms, which necessitates the pruning to reduce their computational and memory footprints. Sparse neural networks as the product, have…

Computer Vision and Pattern Recognition · Computer Science 2024-09-09 Can Jin, Tianjin Huang, Yihua Zhang, Mykola Pechenizkiy, Sijia Liu, Shiwei Liu, Tianlong Chen

Group Fisher Pruning for Practical Network Compression

Network compression has been widely studied since it is able to reduce the memory and computation cost during inference. However, previous methods seldom deal with complicated structures like residual connections, group/depth-wise…

Computer Vision and Pattern Recognition · Computer Science 2021-08-03 Liyang Liu, Shilong Zhang, Zhanghui Kuang, Aojun Zhou, Jing-Hao Xue, Xinjiang Wang, Yimin Chen, Wenming Yang, Qingmin Liao, Wayne Zhang

To prune, or not to prune: exploring the efficacy of pruning for model compression

Model pruning seeks to induce sparsity in a deep neural network's various connection matrices, thereby reducing the number of nonzero-valued parameters in the model. Recent reports (Han et al., 2015; Narang et al., 2017) prune deep networks…

Machine Learning · Statistics 2017-11-15 Michael Zhu, Suyog Gupta

On the Effect of Pruning on Adversarial Robustness

Pruning is a well-known mechanism for reducing the computational cost of deep convolutional networks. However, studies have shown the potential of pruning as a form of regularization, which reduces overfitting and improves generalization.…

Computer Vision and Pattern Recognition · Computer Science 2021-11-29 Artur Jordao, Helio Pedrini

Compressing CNN models for resource-constrained systems by channel and layer pruning

Convolutional Neural Networks (CNNs) have achieved significant breakthroughs in various fields. However, these advancements have led to a substantial increase in the complexity and size of these networks. This poses a challenge when…

Machine Learning · Computer Science 2025-09-11 Ahmed Sadaqa, Di Liu

Exploring Vision Neural Network Pruning via Screening Methodology

The remarkable performance of modern deep neural networks (DNNs) is largely driven by their massive scale, often comprising tens to hundreds of millions, or even billions, of parameters. However, such a scale incurs substantial storage and…

Machine Learning · Computer Science 2026-05-01 Mingyuan Wang, Yangzi Guo, Sida Liu, Yuhang Liu

Performance Aware Convolutional Neural Network Channel Pruning for Embedded GPUs

Convolutional Neural Networks (CNN) are becoming a common presence in many applications and services, due to their superior recognition accuracy. They are increasingly being used on mobile devices, many times just by porting large models…

Machine Learning · Computer Science 2020-02-21 Valentin Radu, Kuba Kaszyk, Yuan Wen, Jack Turner, Jose Cano, Elliot J. Crowley, Bjorn Franke, Amos Storkey, Michael O'Boyle

Dissecting Pruned Neural Networks

Pruning is a standard technique for removing unnecessary structure from a neural network to reduce its storage footprint, computational demands, or energy consumption. Pruning can reduce the parameter-counts of many state-of-the-art neural…

Machine Learning · Computer Science 2019-07-02 Jonathan Frankle, David Bau

PruneTrain: Fast Neural Network Training by Dynamic Sparse Model Reconfiguration

State-of-the-art convolutional neural networks (CNNs) used in vision applications have large models with numerous weights. Training these models is very compute- and memory-resource intensive. Much research has been done on pruning or…

Machine Learning · Computer Science 2019-12-10 Sangkug Lym, Esha Choukse, Siavash Zangeneh, Wei Wen, Sujay Sanghavi, Mattan Erez