Related papers: Filter Sketch for Network Pruning
Neural networks performance has been significantly improved in the last few years, at the cost of an increasing number of floating point operations per second (FLOPs). However, more FLOPs can be an issue when computational resources are…
Neural network pruning offers a promising prospect to facilitate deploying deep neural networks on resource-limited devices. However, existing methods are still challenged by the training inefficiency and labor cost in pruning designs, due…
Filter pruning is effective to reduce the computational costs of neural networks. Existing methods show that updating the previous pruned filter would enable large model capacity and achieve better performance. However, during the iterative…
Despite the promising results of convolutional neural networks (CNNs), their application on devices with limited resources is still a big challenge; this is mainly due to the huge memory and computation requirements of the CNN. To counter…
Convolutional Neural Network (CNN) has an amount of parameter redundancy, filter pruning aims to remove the redundant filters and provides the possibility for the application of CNN on terminal devices. However, previous works pay more…
Recent advances in pruning of neural networks have made it possible to remove a large number of filters or weights without any perceptible drop in accuracy. The number of parameters and that of FLOPs are usually the reported metrics to…
Network pruning is an important research field aiming at reducing computational costs of neural networks. Conventional approaches follow a fixed paradigm which first trains a large and redundant network, and then determines which units…
Acceleration of convolutional neural network has received increasing attention during the past several years. Among various acceleration techniques, filter pruning has its inherent merit by effectively reducing the number of convolution…
Network pruning reduces the computation costs of an over-parameterized network without performance damage. Prevailing pruning algorithms pre-define the width and depth of the pruned networks, and then transfer parameters from the unpruned…
Neural network pruning reduces the computational cost of an over-parameterized network to improve its efficiency. Popular methods vary from $\ell_1$-norm sparsification to Neural Architecture Search (NAS). In this work, we propose a novel…
As sketch research has collectively matured over time, its adaptation for at-mass commercialisation emerges on the immediate horizon. Despite an already mature research endeavour for photos, there is no research on the efficient inference…
Even though norm-based filter pruning methods are widely accepted, it is questionable whether the "smaller-norm-less-important" criterion is optimal in determining filters to prune. Especially when we can keep only a small fraction of the…
Filter pruning has gained widespread adoption for the purpose of compressing and speeding up convolutional neural networks (CNNs). However, existing approaches are still far from practical applications due to biased filter selection and…
Channel pruning is one of the predominant approaches for accelerating deep neural networks. Most existing pruning methods either train from scratch with a sparsity inducing term such as group lasso, or prune redundant channels in a…
The success of CNNs in various applications is accompanied by a significant increase in the computation and parameter storage costs. Recent efforts toward reducing these overheads involve pruning and compressing the weights of various…
Resource-efficient convolution neural networks enable not only the intelligence on edge devices but also opportunities in system-level optimization such as scheduling. In this work, we aim to improve the performance of resource-constrained…
Neural network pruning is a widely used strategy for reducing model storage and computing requirements. It allows to lower the complexity of the network by introducing sparsity in the weights. Because taking advantage of sparse matrices is…
This paper focuses on filter-level network pruning. A novel pruning method, termed CLR-RNF, is proposed. We first reveal a "long-tail" long-tail pruning problem in magnitude-based weight pruning methods, and then propose a computation-aware…
Network pruning techniques, including weight pruning and filter pruning, reveal that most state-of-the-art neural networks can be accelerated without a significant performance drop. This work focuses on filter pruning which enables…
Structured network pruning excels non-structured methods because they can take advantage of the thriving developed parallel computing techniques. In this paper, we propose a new structured pruning method. Firstly, to create more structured…