Related papers: Towards Optimal Compression: Joint Pruning and Qua…

200 papers

Deep Neural Networks (DNNs) have achieved significant advances in a wide range of applications. However, their deployment on resource-constrained devices remains a challenge due to the large number of layers and parameters, which result in…

Neural and Evolutionary Computing · Computer Science 2025-09-05 Sara Makenali, Babak Rokh, Ali Azarpeyvand

Deep neural networks have been applied in many applications exhibiting extraordinary abilities in the field of computer vision. However, complex network architectures challenge efficient real-time deployment and require significant…

Computer Vision and Pattern Recognition · Computer Science 2021-06-16 Tailin Liang, John Glossner, Lei Wang, Shaobo Shi, Xiaotong Zhang

Quantization and pruning are core techniques used to reduce the inference costs of deep neural networks. State-of-the-art quantization techniques are currently applied to both the weights and activations; however, pruning is most often…

Machine Learning · Computer Science 2021-11-02 Xinyu Zhang, Ian Colbert, Ken Kreutz-Delgado, Srinjoy Das

In the traditional deep compression framework, iteratively performing network pruning and quantization can reduce the model size and computation cost to meet the deployment requirements. However, such a step-wise application of pruning and…

Computer Vision and Pattern Recognition · Computer Science 2020-11-13 Wenting Tang, Xingxing Wei, Bo Li

Model compression is generally performed by using quantization, low-rank approximation or pruning, for which various algorithms have been researched in recent years. One fundamental question is: what types of compression work better for a…

Machine Learning · Computer Science 2021-07-12 Miguel Á. Carreira-Perpiñán, Yerlan Idelbayev

Inference time, model size, and accuracy are three key factors in deep model compression. Most of the existing work addresses these three key factors separately as it is difficult to optimize them all at the same time. For example, low-bit…

Computer Vision and Pattern Recognition · Computer Science 2023-07-18 Dan Liu, Xi Chen, Jie Fu, Chen Ma, Xue Liu

The resource requirements of deep neural networks (DNNs) pose significant challenges to their deployment on edge devices. Common approaches to address this issue are pruning and mixed-precision quantization, which lead to latency and memory…

We consider the problem of deep neural net compression by quantization: given a large, reference net, we want to quantize its real-valued weights using a codebook with $K$ entries so that the training loss of the quantized net is minimal.…

Machine Learning · Computer Science 2017-07-17 Miguel Á. Carreira-Perpiñán, Yerlan Idelbayev

Deep neural networks have achieved exceptional results across a range of applications. As the demand for efficient and sparse deep learning models escalates, the significance of model compression, particularly pruning, is increasingly…

Machine Learning · Computer Science 2025-04-01 Yucong Dai, Gen Li, Feng Luo, Xiaolong Ma, Yongkai Wu

Quantization and pruning are two effective model compression methods for deep neural networks. In this paper, we propose Automatic Prune Binarization (APB), a novel compression technique combining quantization with pruning. APB enhances the…

Computer Vision and Pattern Recognition · Computer Science 2023-09-18 Franco Maria Nardini, Cosimo Rulli, Salvatore Trani, Rossano Venturini

We present a differentiable joint pruning and quantization (DJPQ) scheme. We frame neural network compression as a joint gradient-based optimization problem, trading off between model pruning and quantization automatically for hardware…

Machine Learning · Computer Science 2021-04-06 Ying Wang, Yadong Lu, Tijmen Blankevoort

Existing high-performance deep learning models require very intensive computing. For this reason, it is difficult to embed a deep learning model into a system with limited resources. In this paper, we propose the novel idea of the network…

Machine Learning · Computer Science 2019-02-13 Dae-Woong Jeong, Jaehun Kim, Youngseok Kim, Tae-Ho Kim, Myungsu Chae

Deep learning models have achieved tremendous success in most of the industries in recent years. The evolution of these models has also led to an increase in the model size and energy requirement, making it difficult to deploy in production…

Machine Learning · Computer Science 2024-07-24 Aayush Saxena, Arit Kumar Bishwas, Ayush Ashok Mishra, Ryan Armstrong

This work evaluates the compression techniques on ConvNeXt models in image classification tasks using the CIFAR-10 dataset. Structured pruning, unstructured pruning, and dynamic quantization methods are evaluated to reduce model size and…

Machine Learning · Computer Science 2024-09-05 Samer Francy, Raghubir Singh

In this work we present a method to improve the pruning step of the current state-of-the-art methodology to compress neural networks. The novelty of the proposed pruning technique is in its differentiability, which allows pruning to be…

Computer Vision and Pattern Recognition · Computer Science 2019-01-08 Franco Manessi, Alessandro Rozza, Simone Bianco, Paolo Napoletano, Raimondo Schettini

Over time, machine learning models have grown in scope, functionality, and size. Consequently, the increased functionality and size of such models require high-end hardware both for training and for inference after the fact. This…

Machine Learning · Computer Science 2021-09-07 Arhum Ishtiaq, Sara Mahmood, Maheen Anees, Neha Mumtaz

Model compression has gained a lot of attention due to its ability to reduce hardware resource requirements significantly while maintaining accuracy of DNNs. Model compression is especially useful for memory-intensive recurrent neural…

Machine Learning · Computer Science 2018-05-30 Dongsoo Lee, Byeongwook Kim

We present a filter pruning approach for deep model compression, using a multitask network. Our approach is based on learning a pruner network to prune a pre-trained target network. The pruner is essentially a multitask deep neural…

Computer Vision and Pattern Recognition · Computer Science 2020-01-17 Vinay Kumar Verma, Pravendra Singh, Vinay P. Namboodiri, Piyush Rai

Deep neural networks (DNNs) have recently achieved great success in many visual recognition tasks. However, existing deep neural network models are computationally expensive and memory intensive, hindering their deployment in devices with…

Machine Learning · Computer Science 2020-06-16 Yu Cheng, Duo Wang, Pan Zhou, Tao Zhang

Network compression is crucial to making deep networks more efficient, faster, and generalizable to low-end hardware. Current network compression methods have two open problems: first, there is no theoretical framework to…

Machine Learning · Computer Science 2022-06-09 Ziqi Zhou, Li Lian, Yilong Yin, Ze Wang