Related papers: Learned Threshold Pruning

200 papers

Deploying transformer models in practice is challenging due to their inference cost, which scales quadratically with input sequence length. To address this, we present a novel Learned Token Pruning (LTP) method which adaptively removes…

Computation and Language · Computer Science 2022-06-06 Sehoon Kim, Sheng Shen, David Thorsley, Amir Gholami, Woosuk Kwon, Joseph Hassoun, Kurt Keutzer

Structured pruning is a commonly used convolutional neural network (CNN) compression approach. Pruning rate setting is a fundamental problem in structured pruning. Most existing works introduce too many additional learnable parameters to…

Computer Vision and Pattern Recognition · Computer Science 2023-09-26 Pucheng Zhai, Kailing Guo, Fang Liu, Xiaofen Xing, Xiangmin Xu

Network pruning has been the driving force for the acceleration of neural networks and the alleviation of model storage/transmission burden. With the advent of AutoML and neural architecture search (NAS), pruning has become topical with…

Computer Vision and Pattern Recognition · Computer Science 2020-08-04 Yawei Li, Shuhang Gu, Kai Zhang, Luc Van Gool, Radu Timofte

In this work we present a method to improve the pruning step of the current state-of-the-art methodology to compress neural networks. The novelty of the proposed pruning technique is in its differentiability, which allows pruning to be…

Computer Vision and Pattern Recognition · Computer Science 2019-01-08 Franco Manessi, Alessandro Rozza, Simone Bianco, Paolo Napoletano, Raimondo Schettini

Channel pruning is one of the predominant approaches for accelerating deep neural networks. Most existing pruning methods either train from scratch with a sparsity inducing term such as group lasso, or prune redundant channels in a…

Machine Learning · Computer Science 2020-05-25 Ashish Khetan, Zohar Karnin

In this paper, we propose a novel meta learning approach for automatic channel pruning of very deep neural networks. We first train a PruningNet, a kind of meta network, which is able to generate weight parameters for any pruned structure…

Computer Vision and Pattern Recognition · Computer Science 2019-08-15 Zechun Liu, Haoyuan Mu, Xiangyu Zhang, Zichao Guo, Xin Yang, Tim Kwang-Ting Cheng, Jian Sun

The most common method for DNN pruning is hard thresholding of network weights, followed by retraining to recover any lost accuracy. Recently developed smart pruning algorithms use the DNN response over the training set for a variety of…

Machine Learning · Computer Science 2019-05-23 Konstantinos Pitas, Mike Davies, Pierre Vandergheynst

Structured pruning is a well-established technique for compressing neural networks, making it suitable for deployment in resource-limited edge devices. This paper presents an efficient Loss-Aware Automatic Selection of Structured Pruning…

Computer Vision and Pattern Recognition · Computer Science 2025-06-26 Deepak Ghimire, Kilho Lee, Seong-heum Kim

Network pruning is an effective method to reduce the computational expense of over-parameterized neural networks for deployment on low-resource systems. Recent state-of-the-art techniques for retraining pruned networks such as weight…

Machine Learning · Computer Science 2021-05-10 Duong H. Le, Binh-Son Hua

Structured pruning is a popular method for compressing a neural network: given a large trained network, one alternates between removing channel connections and fine-tuning, reducing the overall width of the network. However, the efficacy of…

Machine Learning · Statistics 2019-06-10 Elliot J. Crowley, Jack Turner, Amos Storkey, Michael O'Boyle

Modern deep neural networks require a significant amount of computing time and power to train and deploy, which limits their usage on edge devices. Inspired by the iterative weight pruning in the Lottery Ticket Hypothesis, we propose…

Machine Learning · Computer Science 2022-07-15 John Tan Chong Min, Mehul Motani

Neural networks are easier to optimise when they have many more weights than are required for modelling the mapping from inputs to outputs. This suggests a two-stage learning procedure that first learns a large net and then prunes away…

Machine Learning · Computer Science 2019-09-10 Aidan N. Gomez, Ivan Zhang, Siddhartha Rao Kamalakara, Divyam Madaan, Kevin Swersky, Yarin Gal, Geoffrey E. Hinton

Deep neural networks (DNNs) have become a widely deployed model for numerous machine learning applications. However, their fixed architecture, substantial training cost, and significant model redundancy make it difficult to efficiently…

Neural and Evolutionary Computing · Computer Science 2019-05-28 Xiaoliang Dai, Hongxu Yin, Niraj K. Jha

Pruning neural networks has regained interest in recent years as a means to compress state-of-the-art deep neural networks and enable their deployment on resource-constrained devices. In this paper, we propose a robust compressive learning…

Machine Learning · Computer Science 2020-06-05 George Retsinas, Athena Elafrou, Georgios Goumas, Petros Maragos

Operating deep neural networks (DNNs) on devices with limited resources requires the reduction of their memory as well as computational footprint. Popular reduction methods are network quantization or pruning, which either reduce the word…

Pruning is an effective method to reduce the memory footprint and computational cost associated with large natural language processing models. However, current pruning algorithms either only focus on one pruning category, e.g., structured…

Computation and Language · Computer Science 2022-05-24 Zhewei Yao, Xiaoxia Wu, Linjian Ma, Sheng Shen, Kurt Keutzer, Michael W. Mahoney, Yuxiong He

The deployment of large language models (LLMs) is largely hindered by their large number of parameters. Structural pruning has emerged as a promising solution. Prior structured pruning methods directly remove unimportant parameters based on…

Machine Learning · Computer Science 2026-04-21 Mingkuan Feng, Jinyang Wu, Siyuan Liu, Shuai Zhang, Hongjian Fang, Ruihan Jin, Feihu Che, Pengpeng Shao, Zhengqi Wen, Jianhua Tao

We present a post-training weight pruning method for deep neural networks that achieves accuracy levels tolerable for the production setting and that is sufficiently fast to be run on commodity hardware such as desktop CPUs or edge devices.…

Computer Vision and Pattern Recognition · Computer Science 2021-05-03 Ivan Lazarevich, Alexander Kozlov, Nikita Malinin

Pruning before training enables the deployment of neural networks on smart devices. By retaining weights conducive to generalization, pruned networks can be accommodated on resource-constrained smart devices. It is commonly held that the…

Machine Learning · Computer Science 2025-09-16 Jinying Xiao, Ping Li, Zhe Tang, Jie Nie

Deep Neural Networks have been used in a wide variety of applications with significant success. However, their high complexity, owing to their millions of parameters, has led to problems during deployment in pipelines with low…

Machine Learning · Computer Science 2022-08-15 Elvis Johnson, Xiaochen Tang, Sriramacharyulu Samudrala