Related papers: Structural Dropout for Model Width Compression

Reducing Transformer Depth on Demand with Structured Dropout

Overparameterized transformer networks have obtained state of the art results in various natural language processing tasks, such as machine translation, language modeling, and question answering. These models contain hundreds of millions of…

Machine Learning · Computer Science 2019-09-26 Angela Fan, Edouard Grave, Armand Joulin

To prune, or not to prune: exploring the efficacy of pruning for model compression

Model pruning seeks to induce sparsity in a deep neural network's various connection matrices, thereby reducing the number of nonzero-valued parameters in the model. Recent reports (Han et al., 2015; Narang et al., 2017) prune deep networks…

Machine Learning · Statistics 2017-11-15 Michael Zhu, Suyog Gupta

Reducing Training Complexity in Empirical Quadrature-Based Model Reduction via Structured Compression

Model order reduction seeks to approximate large-scale dynamical systems by lower-dimensional reduced models. For linear systems, a small reduced dimension directly translates into low computational cost, ensuring online efficiency. This…

Numerical Analysis · Mathematics 2025-12-17 Björn Liljegren-Sailer

Data-Independent Structured Pruning of Neural Networks via Coresets

Model compression is crucial for deployment of neural networks on devices with limited computational and memory resources. Many different methods show comparable accuracy of the compressed model and similar compression rates. However, the…

Machine Learning · Computer Science 2020-08-21 Ben Mussay, Daniel Feldman, Samson Zhou, Vladimir Braverman, Margarita Osadchy

C-SWAP: Explainability-Aware Structured Pruning for Efficient Neural Networks Compression

Neural network compression has gained increasing attention in recent years, particularly in computer vision applications, where the need for model reduction is crucial for overcoming deployment constraints. Pruning is a widely used…

Computer Vision and Pattern Recognition · Computer Science 2025-10-22 Baptiste Bauvin, Loïc Baret, Ola Ahmad

Structured Pruning of Large Language Models

Large language models have recently achieved state of the art performance across a wide variety of natural language tasks. Meanwhile, the size of these models and their latency have significantly increased, which makes their usage costly,…

Computation and Language · Computer Science 2021-03-30 Ziheng Wang, Jeremy Wohlwend, Tao Lei

DeepTwist: Learning Model Compression via Occasional Weight Distortion

Model compression has been introduced to reduce the required hardware resources while maintaining the model accuracy. Lots of techniques for model compression, such as pruning, quantization, and low-rank approximation, have been suggested…

Machine Learning · Computer Science 2018-10-31 Dongsoo Lee, Parichay Kapoor, Byeongwook Kim

Data-Efficient Structured Pruning via Submodular Optimization

Structured pruning is an effective approach for compressing large pre-trained neural networks without significantly affecting their performance. However, most current structured pruning methods do not provide any performance guarantees, and…

Machine Learning · Computer Science 2023-02-14 Marwa El Halabi, Suraj Srinivas, Simon Lacoste-Julien

Ising-Dropout: A Regularization Method for Training and Compression of Deep Neural Networks

Overfitting is a major problem in training machine learning models, specifically deep neural networks. This problem may be caused by imbalanced datasets and initialization of the model parameters, which conforms the model too closely to the…

Neural and Evolutionary Computing · Computer Science 2019-02-26 Hojjat Salehinejad, Shahrokh Valaee

Revisiting Data Augmentation in Model Compression: An Empirical and Comprehensive Study

The excellent performance of deep neural networks is usually accompanied by a large number of parameters and computations, which have limited their usage on the resource-limited edge devices. To address this issue, abundant methods such as…

Computer Vision and Pattern Recognition · Computer Science 2023-05-23 Muzhou Yu, Linfeng Zhang, Kaisheng Ma

Smooth Model Compression without Fine-Tuning

Compressing and pruning large machine learning models has become a critical step towards their deployment in real-world applications. Standard pruning and compression techniques are typically designed without taking the structure of the…

Machine Learning · Computer Science 2025-06-02 Christina Runkel, Natacha Kuete Meli, Jovita Lukasik, Ander Biguri, Carola-Bibiane Schönlieb, Michael Moeller

Structural Pruning for Diffusion Models

Generative modeling has recently undergone remarkable advancements, primarily propelled by the transformative implications of Diffusion Probabilistic Models (DPMs). The impressive capability of these models, however, often entails…

Machine Learning · Computer Science 2023-10-03 Gongfan Fang, Xinyin Ma, Xinchao Wang

A "Network Pruning Network" Approach to Deep Model Compression

We present a filter pruning approach for deep model compression, using a multitask network. Our approach is based on learning a pruner network to prune a pre-trained target network. The pruner is essentially a multitask deep neural…

Computer Vision and Pattern Recognition · Computer Science 2020-01-17 Vinay Kumar Verma, Pravendra Singh, Vinay P. Namboodiri, Piyush Rai

Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models

Due to the substantial scale of Large Language Models (LLMs), the direct application of conventional compression methodologies proves impractical. The computational demands associated with even minimal gradient updates present challenges,…

Machine Learning · Computer Science 2023-12-13 Arnav Chavan, Nahush Lele, Deepak Gupta

Revisiting Structured Dropout

Large neural networks are often overparameterised and prone to overfitting. Dropout is a widely used regularization technique to combat overfitting and improve model generalization. However, unstructured Dropout is not always effective for…

Machine Learning · Computer Science 2022-10-07 Yiren Zhao, Oluwatomisin Dada, Xitong Gao, Robert D Mullins

Triangular Dropout: Variable Network Width without Retraining

One of the most fundamental design choices in neural networks is layer width: it affects the capacity of what a network can learn and determines the complexity of the solution. This latter property is often exploited when introducing…

Machine Learning · Computer Science 2022-05-04 Edward W. Staley, Jared Markowitz

A Novel Architecture Slimming Method for Network Pruning and Knowledge Distillation

Network pruning and knowledge distillation are two widely-known model compression methods that efficiently reduce computation cost and model size. A common problem in both pruning and distillation is to determine compressed architecture,…

Computer Vision and Pattern Recognition · Computer Science 2022-02-23 Dongqi Wang, Shengyu Zhang, Zhipeng Di, Xin Lin, Weihua Zhou, Fei Wu

Provable Benefits of Overparameterization in Model Compression: From Double Descent to Pruning Neural Networks

Deep networks are typically trained with many more parameters than the size of the training dataset. Recent empirical evidence indicates that the practice of overparameterization not only benefits training large models, but also assists -…

Machine Learning · Computer Science 2020-12-17 Xiangyu Chang, Yingcong Li, Samet Oymak, Christos Thrampoulidis

AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates

Structured weight pruning is a representative model compression technique of DNNs to reduce the storage and computation requirements and accelerate inference. An automatic hyperparameter determination process is necessary due to the large…

Machine Learning · Computer Science 2019-09-12 Ning Liu, Xiaolong Ma, Zhiyuan Xu, Yanzhi Wang, Jian Tang, Jieping Ye

Efficient Compression of Overparameterized Deep Models through Low-Dimensional Learning Dynamics

Overparameterized models have proven to be powerful tools for solving various machine learning tasks. However, overparameterization often leads to a substantial increase in computational and memory costs, which in turn requires extensive…

Machine Learning · Computer Science 2024-03-13 Soo Min Kwon, Zekai Zhang, Dogyoon Song, Laura Balzano, Qing Qu