Related papers: A Main/Subsidiary Network Framework for Simplifyin…

Network Pruning for Low-Rank Binary Indexing

Pruning is an efficient model compression technique to remove redundancy in the connectivity of deep neural networks (DNNs). Computations using sparse matrices obtained by pruning parameters, however, exhibit vastly different parallelism…

Machine Learning · Computer Science 2019-05-15 Dongsoo Lee, Se Jung Kwon, Byeongwook Kim, Parichay Kapoor, Gu-Yeon Wei

Neural Network Pruning Through Constrained Reinforcement Learning

Network pruning reduces the size of neural networks by removing (pruning) neurons such that the performance drop is minimal. Traditional pruning approaches focus on designing metrics to quantify the usefulness of a neuron which is often…

Computer Vision and Pattern Recognition · Computer Science 2021-11-01 Shehryar Malik, Muhammad Umair Haider, Omer Iqbal, Murtaza Taj

A "Network Pruning Network" Approach to Deep Model Compression

We present a filter pruning approach for deep model compression, using a multitask network. Our approach is based on learning a pruner network to prune a pre-trained target network. The pruner is essentially a multitask deep neural…

Computer Vision and Pattern Recognition · Computer Science 2020-01-17 Vinay Kumar Verma, Pravendra Singh, Vinay P. Namboodiri, Piyush Rai

Binary Neural Networks: A Survey

The binary neural network, largely saving the storage and computation, serves as a promising technique for deploying deep models on resource-limited devices. However, the binarization inevitably causes severe information loss, and even…

Neural and Evolutionary Computing · Computer Science 2020-04-08 Haotong Qin, Ruihao Gong, Xianglong Liu, Xiao Bai, Jingkuan Song, Nicu Sebe

Concurrent Training and Layer Pruning of Deep Neural Networks

We propose an algorithm capable of identifying and eliminating irrelevant layers of a neural network during the early stages of training. In contrast to weight or filter-level pruning, layer pruning reduces the harder to parallelize…

Machine Learning · Computer Science 2024-06-10 Valentin Frank Ingmar Guenter, Athanasios Sideris

Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon

How to develop slim and accurate deep neural networks has become crucial for real-world applications, especially for those employed in embedded systems. Though previous work along this research line has shown some promising results, most…

Neural and Evolutionary Computing · Computer Science 2019-10-02 Xin Dong, Shangyu Chen, Sinno Jialin Pan

Partial Binarization of Neural Networks for Budget-Aware Efficient Learning

Binarization is a powerful compression technique for neural networks, significantly reducing FLOPs, but often results in a significant drop in model performance. To address this issue, partial binarization techniques have been developed,…

Computer Vision and Pattern Recognition · Computer Science 2023-12-07 Udbhav Bamba, Neeraj Anand, Saksham Aggarwal, Dilip K. Prasad, Deepak K. Gupta

Automatic Pruning for Quantized Neural Networks

Neural network quantization and pruning are two techniques commonly used to reduce the computational complexity and memory footprint of these models for deployment. However, most existing pruning strategies operate on full-precision and…

Computer Vision and Pattern Recognition · Computer Science 2020-02-04 Luis Guerra, Bohan Zhuang, Ian Reid, Tom Drummond

Towards thinner convolutional neural networks through Gradually Global Pruning

Deep network pruning is an effective method to reduce the storage and computation cost of deep neural networks when applying them to resource-limited devices. Among many pruning granularities, neuron level pruning will remove redundant…

Computer Vision and Pattern Recognition · Computer Science 2017-03-30 Zhengtao Wang, Ce Zhu, Zhiqiang Xia, Qi Guo, Yipeng Liu

Data-Efficient Structured Pruning via Submodular Optimization

Structured pruning is an effective approach for compressing large pre-trained neural networks without significantly affecting their performance. However, most current structured pruning methods do not provide any performance guarantees, and…

Machine Learning · Computer Science 2023-02-14 Marwa El Halabi, Suraj Srinivas, Simon Lacoste-Julien

ReBNet: Residual Binarized Neural Network

This paper proposes ReBNet, an end-to-end framework for training reconfigurable binary neural networks on software and developing efficient accelerators for execution on FPGA. Binary neural networks offer an intriguing opportunity for…

Machine Learning · Computer Science 2018-03-29 Mohammad Ghasemzadeh, Mohammad Samragh, Farinaz Koushanfar

Flexible Network Binarization with Layer-wise Priority

How to effectively approximate real-valued parameters with binary codes plays a central role in neural network binarization. In this work, we reveal an important fact that binarizing different layers has a widely-varied effect on the…

Computer Vision and Pattern Recognition · Computer Science 2018-02-19 Lixue Zhuang, Yi Xu, Bingbing Ni, Hongteng Xu

Binary Stochastic Filtering: a Method for Neural Network Size Minimization and Supervised Feature Selection

Binary Stochastic Filtering (BSF), the algorithm for feature selection and neuron pruning is proposed in this work. The method defines filtering layer which penalizes amount of the information involved in the training process. This…

Machine Learning · Computer Science 2019-08-21 Andrii Trelin, Ales Prochazka

Structural Pruning in Deep Neural Networks: A Small-World Approach

Deep Neural Networks (DNNs) are usually over-parameterized, causing excessive memory and interconnection cost on the hardware platform. Existing pruning approaches remove secondary parameters at the end of training to reduce the model size;…

Machine Learning · Computer Science 2019-11-12 Gokul Krishnan, Xiaocong Du, Yu Cao

Neural Network Compression via Effective Filter Analysis and Hierarchical Pruning

Network compression is crucial to making the deep networks to be more efficient, faster, and generalizable to low-end hardware. Current network compression methods have two open problems: first, there lacks a theoretical framework to…

Machine Learning · Computer Science 2022-06-09 Ziqi Zhou, Li Lian, Yilong Yin, Ze Wang

Layer-compensated Pruning for Resource-constrained Convolutional Neural Networks

Resource-efficient convolution neural networks enable not only the intelligence on edge devices but also opportunities in system-level optimization such as scheduling. In this work, we aim to improve the performance of resource-constrained…

Computer Vision and Pattern Recognition · Computer Science 2018-10-19 Ting-Wu Chin, Cha Zhang, Diana Marculescu

Binary domain generalization for sparsifying binary neural networks

Binary neural networks (BNNs) are an attractive solution for developing and deploying deep neural network (DNN)-based applications in resource constrained devices. Despite their success, BNNs still suffer from a fixed and limited…

Machine Learning · Computer Science 2023-06-26 Riccardo Schiavone, Francesco Galati, Maria A. Zuluaga

Pruning from Scratch

Network pruning is an important research field aiming at reducing computational costs of neural networks. Conventional approaches follow a fixed paradigm which first trains a large and redundant network, and then determines which units…

Computer Vision and Pattern Recognition · Computer Science 2019-09-30 Yulong Wang, Xiaolu Zhang, Lingxi Xie, Jun Zhou, Hang Su, Bo Zhang, Xiaolin Hu

Pruning Everything, Everywhere, All at Once

Deep learning stands as the modern paradigm for solving cognitive tasks. However, as the problem complexity increases, models grow deeper and computationally prohibitive, hindering advancements in real-world and resource-constrained…

Computer Vision and Pattern Recognition · Computer Science 2025-06-06 Gustavo Henrique do Nascimento, Ian Pons, Anna Helena Reali Costa, Artur Jordao

An experimental comparative study of backpropagation and alternatives for training binary neural networks for image classification

Current artificial neural networks are trained with parameters encoded as floating point numbers that occupy lots of memory space at inference time. Due to the increase in the size of deep learning models, it is becoming very difficult to…

Machine Learning · Computer Science 2024-08-09 Ben Crulis, Barthelemy Serres, Cyril de Runz, Gilles Venturini