Related papers: APP: Anytime Progressive Pruning

Adaptive Dense-to-Sparse Paradigm for Pruning Online Recommendation System with Non-Stationary Data

Large scale deep learning provides a tremendous opportunity to improve the quality of content recommendation systems by employing both wider and deeper models, but this comes at great infrastructural cost and carbon footprint in modern data…

Machine Learning · Computer Science 2020-10-22 Mao Ye, Dhruv Choudhary, Jiecao Yu, Ellie Wen, Zeliang Chen, Jiyan Yang, Jongsoo Park, Qiang Liu, Arun Kejariwal

LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch

Structured pruning is a commonly used convolutional neural network (CNN) compression approach. Pruning rate setting is a fundamental problem in structured pruning. Most existing works introduce too many additional learnable parameters to…

Computer Vision and Pattern Recognition · Computer Science 2023-09-26 Pucheng Zhai, Kailing Guo, Fang Liu, Xiaofen Xing, Xiangmin Xu

Network Pruning via Annealing and Direct Sparsity Control

Artificial neural networks (ANNs) especially deep convolutional networks are very popular these days and have been proved to successfully offer quite reliable solutions to many vision problems. However, the use of deep neural networks is…

Machine Learning · Computer Science 2020-07-28 Yangzi Guo, Yiyuan She, Adrian Barbu

Automatic Attention Pruning: Improving and Automating Model Pruning using Attentions

Pruning is a promising approach to compress deep learning models in order to deploy them on resource-constrained edge devices. However, many existing pruning solutions are based on unstructured pruning, which yields models that cannot…

Machine Learning · Computer Science 2023-03-16 Kaiqi Zhao, Animesh Jain, Ming Zhao

Pruning artificial neural networks: a way to find well-generalizing, high-entropy sharp minima

Recently, a race towards the simplification of deep networks has begun, showing that it is effectively possible to reduce the size of these models with minimal or no performance loss. However, there is a general lack in understanding why…

Machine Learning · Computer Science 2022-12-29 Enzo Tartaglione, Andrea Bragagnolo, Marco Grangetto

Alignment-Constrained Dynamic Pruning for LLMs: Identifying and Preserving Alignment-Critical Circuits

Large Language Models require substantial computational resources for inference, posing deployment challenges. While dynamic pruning offers superior efficiency over static methods through adaptive circuit selection, it exacerbates alignment…

Machine Learning · Computer Science 2025-11-12 Dev Patel, Gabrielle Gervacio, Diekola Raimi, Kevin Zhu, Ryan Lagasse, Gabriel Grand, Ashwinee Panda, Maheep Chaudhary

Effective Model Sparsification by Scheduled Grow-and-Prune Methods

Deep neural networks (DNNs) are effective in solving many real-world problems. Larger DNN models usually exhibit better quality (e.g., accuracy) but their excessive computation results in long inference time. Model sparsification can reduce…

Computer Vision and Pattern Recognition · Computer Science 2022-03-07 Xiaolong Ma, Minghai Qin, Fei Sun, Zejiang Hou, Kun Yuan, Yi Xu, Yanzhi Wang, Yen-Kuang Chen, Rong Jin, Yuan Xie

Adaptive Sharpness-Aware Pruning for Robust Sparse Networks

Robustness and compactness are two essential attributes of deep learning models that are deployed in the real world. The goals of robustness and compactness may seem to be at odds, since robustness requires generalization across domains,…

Machine Learning · Computer Science 2024-03-14 Anna Bair, Hongxu Yin, Maying Shen, Pavlo Molchanov, Jose Alvarez

Data-Informed Global Sparseness in Attention Mechanisms for Deep Neural Networks

Attention mechanisms play a crucial role in the neural revolution of Natural Language Processing (NLP). With the growth of attention-based models, several pruning techniques have been developed to identify and exploit sparseness, making…

Computation and Language · Computer Science 2024-05-20 Ileana Rugina, Rumen Dangovski, Li Jing, Preslav Nakov, Marin Soljačić

AUTOSPARSE: Towards Automated Sparse Training of Deep Neural Networks

Sparse training is emerging as a promising avenue for reducing the computational cost of training neural networks. Several recent studies have proposed pruning methods using learnable thresholds to efficiently explore the non-uniform…

Machine Learning · Computer Science 2023-04-17 Abhisek Kundu, Naveen K. Mellempudi, Dharma Teja Vooturi, Bharat Kaul, Pradeep Dubey

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

Deep neural networks often suffer from poor generalization caused by complex and non-convex loss landscapes. One of the popular solutions is Sharpness-Aware Minimization (SAM), which smooths the loss landscape via minimizing the maximized…

Machine Learning · Computer Science 2022-10-25 Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, Dacheng Tao

Layer-adaptive sparsity for the Magnitude-based Pruning

Recent discoveries on neural network pruning reveal that, with a carefully chosen layerwise sparsity, a simple magnitude-based pruning achieves state-of-the-art tradeoff between sparsity and performance. However, without a clear consensus…

Machine Learning · Computer Science 2021-05-11 Jaeho Lee, Sejun Park, Sangwoo Mo, Sungsoo Ahn, Jinwoo Shin

Efficient LLMs with AMP: Attention Heads and MLP Pruning

Deep learning drives a new wave in computing systems and triggers the automation of increasingly complex problems. In particular, Large Language Models (LLMs) have significantly advanced cognitive tasks, often matching or even surpassing…

Machine Learning · Computer Science 2025-05-01 Leandro Giusti Mugnaini, Bruno Lopes Yamamoto, Lucas Lauton de Alcantara, Victor Zacarias, Edson Bollis, Lucas Pellicer, Anna Helena Reali Costa, Artur Jordao

Zeroth-Order Adaptive Neuron Alignment Based Pruning without Re-Training

Network pruning focuses on algorithms that aim to reduce a given model's computational cost by removing a subset of its parameters while having minimal impact on performance. Throughout the last decade, the most widely used pruning paradigm…

Machine Learning · Computer Science 2025-11-11 Elia Cunegatti, Leonardo Lucio Custode, Giovanni Iacca

Single-Shot Pruning for Offline Reinforcement Learning

Deep Reinforcement Learning (RL) is a powerful framework for solving complex real-world problems. Large neural networks employed in the framework are traditionally associated with better generalization capabilities, but their increased size…

Machine Learning · Computer Science 2022-01-03 Samin Yeasar Arnob, Riyasat Ohib, Sergey Plis, Doina Precup

Learning Instance-wise Sparsity for Accelerating Deep Models

Exploring deep convolutional neural networks of high efficiency and low memory usage is very essential for a wide variety of machine learning tasks. Most of existing approaches used to accelerate deep models by manipulating parameters or…

Computer Vision and Pattern Recognition · Computer Science 2019-07-30 Chuanjian Liu, Yunhe Wang, Kai Han, Chunjing Xu, Chang Xu

Weight Pruning via Adaptive Sparsity Loss

Pruning neural networks has regained interest in recent years as a means to compress state-of-the-art deep neural networks and enable their deployment on resource-constrained devices. In this paper, we propose a robust compressive learning…

Machine Learning · Computer Science 2020-06-05 George Retsinas, Athena Elafrou, Georgios Goumas, Petros Maragos

Growing Networks with Autonomous Pruning

This paper introduces Growing Networks with Autonomous Pruning (GNAP) for image classification. Unlike traditional convolutional neural networks, GNAP change their size, as well as the number of parameters they are using, during training,…

Computer Vision and Pattern Recognition · Computer Science 2026-03-23 Charles De Lambilly, Stefan Duffner

APP: Accelerated Path Patching with Task-Specific Pruning

Circuit discovery is a key step in many mechanistic interpretability pipelines. Current methods, such as Path Patching, are computationally expensive and have limited in-depth circuit analysis for smaller models. In this study, we propose…

Machine Learning · Computer Science 2025-11-10 Frauke Andersen, William Rudman, Ruochen Zhang, Carsten Eickhoff

Structurally Prune Anything: Any Architecture, Any Framework, Any Time

Neural network pruning serves as a critical technique for enhancing the efficiency of deep learning models. Unlike unstructured pruning, which only sets specific parameters to zero, structured pruning eliminates entire channels, thus…

Machine Learning · Computer Science 2024-03-29 Xun Wang, John Rachwan, Stephan Günnemann, Bertrand Charpentier