Related papers: BMRS: Bayesian Model Reduction for Structured Prun…

Pruning a neural network using Bayesian inference

Neural network pruning is a highly effective technique aimed at reducing the computational and memory demands of large neural networks. In this research paper, we present a novel approach to pruning neural networks utilizing Bayesian…

Machine Learning · Statistics · 2023-08-07 · Sunil Mathew, Daniel B. Rowe

Principled Pruning of Bayesian Neural Networks through Variational Free Energy Minimization

Bayesian model reduction provides an efficient approach for comparing the performance of all nested sub-models of a model, without re-evaluating any of these sub-models. Until now, Bayesian model reduction has been applied mainly in the…

Machine Learning · Computer Science · 2024-10-15 · Jim Beckers, Bart van Erp, Ziyue Zhao, Kirill Kondrashov, Bert de Vries

Data-Efficient Structured Pruning via Submodular Optimization

Structured pruning is an effective approach for compressing large pre-trained neural networks without significantly affecting their performance. However, most current structured pruning methods do not provide any performance guarantees, and…

Machine Learning · Computer Science · 2023-02-14 · Marwa El Halabi, Suraj Srinivas, Simon Lacoste-Julien

A Closer Look at Structured Pruning for Neural Network Compression

Structured pruning is a popular method for compressing a neural network: given a large trained network, one alternates between removing channel connections and fine-tuning, reducing the overall width of the network. However, the efficacy of…

Machine Learning · Statistics · 2019-06-10 · Elliot J. Crowley, Jack Turner, Amos Storkey, Michael O'Boyle

ThinResNet: A New Baseline for Structured Convolutional Networks Pruning

Pruning is a compression method which aims to improve the efficiency of neural networks by reducing their number of parameters while maintaining a good performance, thus enhancing the performance-to-cost ratio in nontrivial ways. Of…

Neural and Evolutionary Computing · Computer Science · 2023-09-25 · Hugo Tessier, Ghouti Boukli Hacene, Vincent Gripon

Bayesian sparsification for deep neural networks with Bayesian model reduction

Deep learning's immense capabilities are often constrained by the complexity of its models, leading to an increasing demand for effective sparsification techniques. Bayesian sparsification for deep learning emerges as a crucial approach,…

Machine Learning · Statistics · 2024-07-30 · Dimitrije Marković, Karl J. Friston, Stefan J. Kiebel

Structured Bayesian Pruning via Log-Normal Multiplicative Noise

Dropout-based regularization methods can be regarded as injecting random noise of pre-defined magnitude into different parts of the neural network during training. It was recently shown that the Bayesian dropout procedure not only improves…

Machine Learning · Statistics · 2017-11-07 · Kirill Neklyudov, Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov

Structured Pruning of Neural Networks with Budget-Aware Regularization

Pruning methods have been shown to be effective at reducing the size of deep neural networks while keeping accuracy almost intact. Among the most effective methods are those that prune a network while training it with a sparsity prior loss and…

Neural and Evolutionary Computing · Computer Science · 2019-12-20 · Carl Lemaire, Andrew Achkar, Pierre-Marc Jodoin

Structured Bayesian Compression for Deep Neural Networks Based on The Turbo-VBI Approach

With the growth of neural network size, model compression has attracted increasing interest in recent research. As one of the most common techniques, pruning has been studied for a long time. By exploiting the structured sparsity of the…

Machine Learning · Computer Science · 2023-04-12 · Chengyu Xia, Danny H. K. Tsang, Vincent K. N. Lau

Bayesian Neural Networks at Scale: A Performance Analysis and Pruning Study

Bayesian neural networks (BNNs) are a promising method of obtaining statistical uncertainties for neural network predictions, but they carry a higher computational overhead, which can limit their practical usage. This work explores the use of high…

Machine Learning · Computer Science · 2020-09-09 · Himanshu Sharma, Elise Jennings

Leveraging Structured Pruning of Convolutional Neural Networks

Structured pruning is a popular method to reduce the cost of convolutional neural networks, which are the state of the art in many computer vision tasks. However, depending on the architecture, pruning introduces dimensional discrepancies…

Neural and Evolutionary Computing · Computer Science · 2022-12-13 · Hugo Tessier, Vincent Gripon, Mathieu Léonardon, Matthieu Arzel, David Bertrand, Thomas Hannagan

Efficient Model Compression for Bayesian Neural Networks

Model compression has drawn much attention within the deep learning community recently. Compressing a dense neural network offers many advantages, including lower computation cost, deployability to devices with limited storage and memory,…

Machine Learning · Computer Science · 2024-11-04 · Diptarka Saha, Zihe Liu, Feng Liang

Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes

Structured pruning is a promising approach to create smaller, faster large language models. However, existing methods typically rely on computing the gradient via backward passes, which can inflate memory requirements and compute costs. In…

Machine Learning · Computer Science · 2026-01-23 · Steven Kolawole, Lucio Dery, Jean-François Kagy, Virginia Smith, Graham Neubig, Ameet Talwalkar

Automated Pruning for Deep Neural Network Compression

In this work we present a method to improve the pruning step of the current state-of-the-art methodology to compress neural networks. The novelty of the proposed pruning technique is in its differentiability, which allows pruning to be…

Computer Vision and Pattern Recognition · Computer Science · 2019-01-08 · Franco Manessi, Alessandro Rozza, Simone Bianco, Paolo Napoletano, Raimondo Schettini

DReSS: Data-driven Regularized Structured Streamlining for Large Language Models

Large language models (LLMs) have achieved significant progress across various domains, but their increasing scale results in high computational and memory costs. Recent studies have revealed that LLMs exhibit sparsity, providing the…

Machine Learning · Computer Science · 2025-07-01 · Mingkuan Feng, Jinyang Wu, Shuai Zhang, Pengpeng Shao, Ruihan Jin, Zhengqi Wen, Jianhua Tao, Feihu Che

Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy

Neural network pruning is a popular technique used to reduce the inference costs of modern, potentially overparameterized, networks. Starting from a pre-trained network, the process is as follows: remove redundant parameters, retrain, and…

Machine Learning · Computer Science · 2021-03-05 · Lucas Liebenwein, Cenk Baykal, Brandon Carter, David Gifford, Daniela Rus

MaskPrune: Mask-based LLM Pruning for Layer-wise Uniform Structures

The remarkable performance of large language models (LLMs) in various language tasks has attracted considerable attention. However, the ever-increasing size of these models presents growing challenges for deployment and inference.…

Computation and Language · Computer Science · 2025-02-21 · Jiayu Qin, Jianchao Tan, Kefeng Zhang, Xunliang Cai, Wei Wang

Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models

Despite the remarkable success of Large Language Models (LLMs), the massive size poses significant deployment challenges, particularly on resource-constrained hardware. While existing LLM compression methods focus on quantization, pruning…

Artificial Intelligence · Computer Science · 2023-10-12 · Song Guo, Jiahang Xu, Li Lyna Zhang, Mao Yang

To prune, or not to prune: exploring the efficacy of pruning for model compression

Model pruning seeks to induce sparsity in a deep neural network's various connection matrices, thereby reducing the number of nonzero-valued parameters in the model. Recent reports (Han et al., 2015; Narang et al., 2017) prune deep networks…

Machine Learning · Statistics · 2017-11-15 · Michael Zhu, Suyog Gupta

Structured Pruning of Recurrent Neural Networks through Neuron Selection

Recurrent neural networks (RNNs) have recently achieved remarkable successes in a number of applications. However, the huge size and computational burden of these models make them difficult to deploy on edge devices. A practically…

Machine Learning · Computer Science · 2019-12-10 · Liangjian Wen, Xuanyang Zhang, Haoli Bai, Zenglin Xu