English
Related papers

Related papers: A Greedy Algorithm for Quantizing Neural Networks

200 papers

In this short note, we propose a new method for quantizing the weights of a fully trained neural network. A simple deterministic pre-processing step allows us to quantize network layers via memoryless scalar quantization while preserving…

Machine Learning · Computer Science 2023-04-06 Johannes Maly , Rayan Saab

Quantization is a widely used compression method that effectively reduces redundancies in over-parameterized neural networks. However, existing quantization techniques for deep neural networks often lack a comprehensive error analysis due…

Machine Learning · Computer Science 2023-09-21 Jinjie Zhang , Rayan Saab

While neural networks have been remarkably successful in a wide array of applications, implementing them in resource-constrained hardware remains an area of intense research. By replacing the weights of a neural network with quantized…

Machine Learning · Computer Science 2023-01-18 Jinjie Zhang , Yixuan Zhou , Rayan Saab

Deep networks run with low precision operations at inference time offer power and space advantages over high precision alternatives, but need to overcome the challenge of maintaining high accuracy as precision decreases. Here, we present a…

Machine Learning · Computer Science 2020-05-08 Steven K. Esser , Jeffrey L. McKinstry , Deepika Bablani , Rathinakumar Appuswamy , Dharmendra S. Modha

We present a greedy-based approach to construct an efficient single hidden layer neural network with the ReLU activation that approximates a target function. In our approach we obtain a shallow network by utilizing a greedy algorithm with…

Machine Learning · Computer Science 2021-10-01 Anton Dereventsov , Armenak Petrosyan , Clayton Webster

Deep neural networks (DNNs) are quantized for efficient inference on resource-constrained platforms. However, training deep learning models with low-precision weights and activations involves a demanding optimization task, which calls for…

Machine Learning · Computer Science 2021-05-25 Ziang Long , Penghang Yin , Jack Xin

In low-latency or mobile applications, lower computation complexity, lower memory footprint and better energy efficiency are desired. Many prior works address this need by removing redundant parameters. Parameter quantization replaces…

Machine Learning · Computer Science 2021-11-16 Cheng-Chou Lan

Although considerable progress has been obtained in neural network quantization for efficient inference, existing methods are not scalable to heterogeneous devices as one dedicated model needs to be trained, transmitted, and stored for one…

Machine Learning · Computer Science 2022-12-13 Hai Wu , Ruifei He , Haoru Tan , Xiaojuan Qi , Kaibin Huang

Currently, deep neural networks are deployed on low-power portable devices by first training a full-precision model using powerful hardware, and then deriving a corresponding low-precision model for efficient inference on such systems.…

Machine Learning · Computer Science 2017-11-15 Hao Li , Soham De , Zheng Xu , Christoph Studer , Hanan Samet , Tom Goldstein

We propose a new scalable method to optimize the architecture of an artificial neural network. The proposed algorithm, called Greedy Search for Neural Network Architecture, aims to determine a neural network with minimal number of layers…

Machine Learning · Computer Science 2021-04-30 Massimiliano Lupo Pasini , Junqi Yin , Ying Wai Li , Markus Eisenbach

Multilayer networks have seen a resurgence under the umbrella of deep learning. Current deep learning algorithms train the layers of the network sequentially, improving algorithmic performance as well as providing some regularization. We…

Machine Learning · Computer Science 2016-02-22 Ke Wu , Malik Magdon-Ismail

We propose a new method for learning deep neural network models that is based on a greedy learning approach: we add one basis function at a time, and a new basis function is generated as a non-linear activation function applied to a linear…

Machine Learning · Computer Science 2020-02-18 Daria Fokina , Ivan Oseledets

We propose a greedy algorithm to select $N$ important features among $P$ input features for a non-linear prediction problem. The features are selected one by one sequentially, in an iterative loss minimization procedure. We use neural…

Machine Learning · Computer Science 2023-09-14 Sandipan Das , Alireza M. Javid , Prakash Borpatra Gohain , Yonina C. Eldar , Saikat Chatterjee

We introduce a quantization-aware training algorithm that guarantees avoiding numerical overflow when reducing the precision of accumulators during inference. We leverage weight normalization as a means of constraining parameters during…

Machine Learning · Computer Science 2023-02-01 Ian Colbert , Alessandro Pappalardo , Jakoba Petri-Koenig

Despite the great success of deep learning, recent works show that large deep neural networks are often highly redundant and can be significantly reduced in size. However, the theoretical question of how much we can prune a neural network…

Machine Learning · Computer Science 2020-11-02 Mao Ye , Lemeng Wu , Qiang Liu

We introduce a method to train Quantized Neural Networks (QNNs) --- neural networks with extremely low precision (e.g., 1-bit) weights and activations, at run-time. At train-time the quantized weights and activations are used for computing…

Neural and Evolutionary Computing · Computer Science 2016-09-23 Itay Hubara , Matthieu Courbariaux , Daniel Soudry , Ran El-Yaniv , Yoshua Bengio

Although weight and activation quantization is an effective approach for Deep Neural Network (DNN) compression and has a lot of potentials to increase inference speed leveraging bit-operations, there is still a noticeable gap in terms of…

Computer Vision and Pattern Recognition · Computer Science 2018-07-27 Dongqing Zhang , Jiaolong Yang , Dongqiangzi Ye , Gang Hua

Convolutional Neural Networks (CNNs) have proven to be a powerful state-of-the-art method for image classification tasks. One drawback however is the high computational complexity and high memory consumption of CNNs which makes them…

Computer Vision and Pattern Recognition · Computer Science 2021-02-04 Rishabh Goyal , Joaquin Vanschoren , Victor van Acht , Stephan Nijssen

Quantization of deep neural networks is a promising approach that reduces the inference cost, making it feasible to run deep networks on resource-restricted devices. Inspired by existing methods, we propose a new framework to learn the…

Machine Learning · Computer Science 2022-02-28 Amir Ardakani , Arash Ardakani , Brett Meyer , James J. Clark , Warren J. Gross

Greedy layer-wise or module-wise training of neural networks is compelling in constrained and on-device settings where memory is limited, as it circumvents a number of problems of end-to-end back-propagation. However, it suffers from a…

Machine Learning · Computer Science 2023-10-06 Skander Karkar , Ibrahim Ayed , Emmanuel de Bézenac , Patrick Gallinari
‹ Prev 1 2 3 10 Next ›