Related papers: Learned Threshold Pruning

Learned Token Pruning for Transformers

Deploying transformer models in practice is challenging due to their inference cost, which scales quadratically with input sequence length. To address this, we present a novel Learned Token Pruning (LTP) method which adaptively removes…

Computation and Language · Computer Science 2022-06-06 Sehoon Kim , Sheng Shen , David Thorsley , Amir Gholami , Woosuk Kwon , Joseph Hassoun , Kurt Keutzer

LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch

Structured pruning is a commonly used convolutional neural network (CNN) compression approach. Pruning rate setting is a fundamental problem in structured pruning. Most existing works introduce too many additional learnable parameters to…

Computer Vision and Pattern Recognition · Computer Science 2023-09-26 Pucheng Zhai , Kailing Guo , Fang Liu , Xiaofen Xing , Xiangmin Xu

DHP: Differentiable Meta Pruning via HyperNetworks

Network pruning has been the driving force for the acceleration of neural networks and the alleviation of model storage/transmission burden. With the advent of AutoML and neural architecture search (NAS), pruning has become topical with…

Computer Vision and Pattern Recognition · Computer Science 2020-08-04 Yawei Li , Shuhang Gu , Kai Zhang , Luc Van Gool , Radu Timofte

Automated Pruning for Deep Neural Network Compression

In this work we present a method to improve the pruning step of the current state-of-the-art methodology to compress neural networks. The novelty of the proposed pruning technique is in its differentiability, which allows pruning to be…

Computer Vision and Pattern Recognition · Computer Science 2019-01-08 Franco Manessi , Alessandro Rozza , Simone Bianco , Paolo Napoletano , Raimondo Schettini

PruneNet: Channel Pruning via Global Importance

Channel pruning is one of the predominant approaches for accelerating deep neural networks. Most existing pruning methods either train from scratch with a sparsity inducing term such as group lasso, or prune redundant channels in a…

Machine Learning · Computer Science 2020-05-25 Ashish Khetan , Zohar Karnin

MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning

In this paper, we propose a novel meta learning approach for automatic channel pruning of very deep neural networks. We first train a PruningNet, a kind of meta network, which is able to generate weight parameters for any pruned structure…

Computer Vision and Pattern Recognition · Computer Science 2019-08-15 Zechun Liu , Haoyuan Mu , Xiangyu Zhang , Zichao Guo , Xin Yang , Tim Kwang-Ting Cheng , Jian Sun

Revisiting hard thresholding for DNN pruning

The most common method for DNN pruning is hard thresholding of network weights, followed by retraining to recover any lost accuracy. Recently developed smart pruning algorithms use the DNN response over the training set for a variety of…

Machine Learning · Computer Science 2019-05-23 Konstantinos Pitas , Mike Davies , Pierre Vandergheynst

Loss-Aware Automatic Selection of Structured Pruning Criteria for Deep Neural Network Acceleration

Structured pruning is a well-established technique for compressing neural networks, making it suitable for deployment in resource-limited edge devices. This paper presents an efficient Loss-Aware Automatic Selection of Structured Pruning…

Computer Vision and Pattern Recognition · Computer Science 2025-06-26 Deepak Ghimire , Kilho Lee , Seong-heum Kim

Network Pruning That Matters: A Case Study on Retraining Variants

Network pruning is an effective method to reduce the computational expense of over-parameterized neural networks for deployment on low-resource systems. Recent state-of-the-art techniques for retraining pruned networks such as weight…

Machine Learning · Computer Science 2021-05-10 Duong H. Le , Binh-Son Hua

A Closer Look at Structured Pruning for Neural Network Compression

Structured pruning is a popular method for compressing a neural network: given a large trained network, one alternates between removing channel connections and fine-tuning; reducing the overall width of the network. However, the efficacy of…

Machine Learning · Statistics 2019-06-10 Elliot J. Crowley , Jack Turner , Amos Storkey , Michael O'Boyle

DropNet: Reducing Neural Network Complexity via Iterative Pruning

Modern deep neural networks require a significant amount of computing time and power to train and deploy, which limits their usage on edge devices. Inspired by the iterative weight pruning in the Lottery Ticket Hypothesis, we propose…

Machine Learning · Computer Science 2022-07-15 John Tan Chong Min , Mehul Motani

Learning Sparse Networks Using Targeted Dropout

Neural networks are easier to optimise when they have many more weights than are required for modelling the mapping from inputs to outputs. This suggests a two-stage learning procedure that first learns a large net and then prunes away…

Machine Learning · Computer Science 2019-09-10 Aidan N. Gomez , Ivan Zhang , Siddhartha Rao Kamalakara , Divyam Madaan , Kevin Swersky , Yarin Gal , Geoffrey E. Hinton

Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks

Deep neural networks (DNNs) have become a widely deployed model for numerous machine learning applications. However, their fixed architecture, substantial training cost, and significant model redundancy make it difficult to efficiently…

Neural and Evolutionary Computing · Computer Science 2019-05-28 Xiaoliang Dai , Hongxu Yin , Niraj K. Jha

Weight Pruning via Adaptive Sparsity Loss

Pruning neural networks has regained interest in recent years as a means to compress state-of-the-art deep neural networks and enable their deployment on resource-constrained devices. In this paper, we propose a robust compressive learning…

Machine Learning · Computer Science 2020-06-05 George Retsinas , Athena Elafrou , Georgios Goumas , Petros Maragos

Iteratively Training Look-Up Tables for Network Quantization

Operating deep neural networks (DNNs) on devices with limited resources requires the reduction of their memory as well as computational footprint. Popular reduction methods are network quantization or pruning, which either reduce the word…

Machine Learning · Computer Science 2023-07-19 Fabien Cardinaux , Stefan Uhlich , Kazuki Yoshiyama , Javier Alonso Garcia , Lukas Mauch , Stephen Tiedemann , Thomas Kemp , Akira Nakamura

LEAP: Learnable Pruning for Transformer-based Models

Pruning is an effective method to reduce the memory footprint and computational cost associated with large natural language processing models. However, current pruning algorithms either only focus on one pruning category, e.g., structured…

Computation and Language · Computer Science 2022-05-24 Zhewei Yao , Xiaoxia Wu , Linjian Ma , Sheng Shen , Kurt Keutzer , Michael W. Mahoney , Yuxiong He

Two-Stage Regularization-Based Structured Pruning for LLMs

The deployment of large language models (LLMs) is largely hindered by their large number of parameters. Structural pruning has emerged as a promising solution. Prior structured pruning methods directly remove unimportant parameters based on…

Machine Learning · Computer Science 2026-04-21 Mingkuan Feng , Jinyang Wu , Siyuan Liu , Shuai Zhang , Hongjian Fang , Ruihan Jin , Feihu Che , Pengpeng Shao , Zhengqi Wen , Jianhua Tao

Post-training deep neural network pruning via layer-wise calibration

We present a post-training weight pruning method for deep neural networks that achieves accuracy levels tolerable for the production setting and that is sufficiently fast to be run on commodity hardware such as desktop CPUs or edge devices.…

Computer Vision and Pattern Recognition · Computer Science 2021-05-03 Ivan Lazarevich , Alexander Kozlov , Nikita Malinin

LNPT: Label-free Network Pruning and Training

Pruning before training enables the deployment of neural networks on smart devices. By retaining weights conducive to generalization, pruned networks can be accommodated on resource-constrained smart devices. It is commonly held that the…

Machine Learning · Computer Science 2025-09-16 Jinying Xiao , Ping Li , Zhe Tang , Jie Nie

WeightMom: Learning Sparse Networks using Iterative Momentum-based pruning

Deep Neural Networks have been used in a wide variety of applications with significant success. However, their highly complex nature owing to comprising millions of parameters has lead to problems during deployment in pipelines with low…

Machine Learning · Computer Science 2022-08-15 Elvis Johnson , Xiaochen Tang , Sriramacharyulu Samudrala