English
Related papers

Related papers: Training Quantized Nets: A Deeper Understanding

200 papers

Deep neural networks are the state-of-the-art methods for many real-world tasks, such as computer vision, natural language processing and speech recognition. For all its popularity, deep neural networks are also criticized for consuming a…

Machine Learning · Computer Science 2018-12-18 Yunhui Guo

This paper tackles the problem of training a deep convolutional neural network of both low-bitwidth weights and activations. Optimizing a low-precision network is very challenging due to the non-differentiability of the quantizer, which may…

Computer Vision and Pattern Recognition · Computer Science 2021-06-07 Bohan Zhuang , Jing Liu , Mingkui Tan , Lingqiao Liu , Ian Reid , Chunhua Shen

Deep neural networks (DNNs) are quantized for efficient inference on resource-constrained platforms. However, training deep learning models with low-precision weights and activations involves a demanding optimization task, which calls for…

Machine Learning · Computer Science 2021-05-25 Ziang Long , Penghang Yin , Jack Xin

Enabling low precision implementations of deep learning models, without considerable performance degradation, is necessary in resource and latency constrained settings. Moreover, exploiting the differences in sensitivity to quantization…

Machine Learning · Computer Science 2022-10-28 Ignacio Hounie , Juan Elenter , Alejandro Ribeiro

We introduce a method to train Quantized Neural Networks (QNNs) --- neural networks with extremely low precision (e.g., 1-bit) weights and activations, at run-time. At train-time the quantized weights and activations are used for computing…

Neural and Evolutionary Computing · Computer Science 2016-09-23 Itay Hubara , Matthieu Courbariaux , Daniel Soudry , Ran El-Yaniv , Yoshua Bengio

Deep networks run with low precision operations at inference time offer power and space advantages over high precision alternatives, but need to overcome the challenge of maintaining high accuracy as precision decreases. Here, we present a…

Machine Learning · Computer Science 2020-05-08 Steven K. Esser , Jeffrey L. McKinstry , Deepika Bablani , Rathinakumar Appuswamy , Dharmendra S. Modha

Quantizing deep networks with adaptive bit-widths is a promising technique for efficient inference across many devices and resource constraints. In contrast to static methods that repeat the quantization process and train different models…

Computer Vision and Pattern Recognition · Computer Science 2021-09-20 Ximeng Sun , Rameswar Panda , Chun-Fu Chen , Naigang Wang , Bowen Pan , Kailash Gopalakrishnan , Aude Oliva , Rogerio Feris , Kate Saenko

Deep neural networks (DNNs) can be made hardware-efficient by reducing the numerical precision of the weights and activations of the network and by improving the network's resilience to noise. However, this gain in efficiency often comes at…

Although deep neural networks are highly effective, their high computational and memory costs severely challenge their applications on portable devices. As a consequence, low-bit quantization, which converts a full-precision neural network…

Computer Vision and Pattern Recognition · Computer Science 2019-12-02 Jiwei Yang , Xu Shen , Jun Xing , Xinmei Tian , Houqiang Li , Bing Deng , Jianqiang Huang , Xiansheng Hua

Although weight and activation quantization is an effective approach for Deep Neural Network (DNN) compression and has a lot of potentials to increase inference speed leveraging bit-operations, there is still a noticeable gap in terms of…

Computer Vision and Pattern Recognition · Computer Science 2018-07-27 Dongqing Zhang , Jiaolong Yang , Dongqiangzi Ye , Gang Hua

Neural networks (NNs) have been extremely successful across many tasks in machine learning. Quantization of NN weights has become an important topic due to its impact on their energy efficiency, inference time and deployment on hardware.…

Machine Learning · Computer Science 2021-05-06 Burak Bartan , Mert Pilanci

This paper tackles the problem of training a deep convolutional neural network with both low-precision weights and low-bitwidth activations. Optimizing a low-precision network is very challenging since the training process can easily get…

Computer Vision and Pattern Recognition · Computer Science 2021-06-05 Bohan Zhuang , Chunhua Shen , Mingkui Tan , Lingqiao Liu , Ian Reid

We consider the problem of deep neural net compression by quantization: given a large, reference net, we want to quantize its real-valued weights using a codebook with $K$ entries so that the training loss of the quantized net is minimal.…

Machine Learning · Computer Science 2017-07-17 Miguel Á. Carreira-Perpiñán , Yerlan Idelbayev

Quantized deep neural networks (QDNNs) are necessary for low-power, high throughput, and embedded applications. Previous studies mostly focused on developing optimization methods for the quantization of given models. However, quantization…

Machine Learning · Computer Science 2020-06-02 Yoonho Boo , Sungho Shin , Wonyong Sung

Gradient descent methods have long been the de facto standard for training deep neural networks. Millions of training samples are fed into models with billions of parameters, which are slowly updated over hundreds of epochs. Recently, it's…

Machine Learning · Computer Science 2023-02-14 Tim Whitaker

Recent years have witnessed the great advance of deep learning in a variety of vision tasks. Many state-of-the-art deep neural networks suffer from large size and high complexity, which makes it difficult to deploy in resource-limited…

Computer Vision and Pattern Recognition · Computer Science 2019-05-29 Zhengguang Zhou , Wengang Zhou , Xutao Lv , Xuan Huang , Xiaoyu Wang , Houqiang Li

In low-latency or mobile applications, lower computation complexity, lower memory footprint and better energy efficiency are desired. Many prior works address this need by removing redundant parameters. Parameter quantization replaces…

Machine Learning · Computer Science 2021-11-16 Cheng-Chou Lan

Convolutional neural networks require significant memory bandwidth and storage for intermediate computations, apart from substantial computing resources. Neural network quantization has significant benefits in reducing the amount of…

Computer Vision and Pattern Recognition · Computer Science 2019-05-30 Ron Banner , Yury Nahshan , Elad Hoffer , Daniel Soudry

Although considerable progress has been obtained in neural network quantization for efficient inference, existing methods are not scalable to heterogeneous devices as one dedicated model needs to be trained, transmitted, and stored for one…

Machine Learning · Computer Science 2022-12-13 Hai Wu , Ruifei He , Haoru Tan , Xiaojuan Qi , Kaibin Huang

We present an overview of techniques for quantizing convolutional neural networks for inference with integer weights and activations. Per-channel quantization of weights and per-layer quantization of activations to 8-bits of precision…

Machine Learning · Computer Science 2018-06-22 Raghuraman Krishnamoorthi
‹ Prev 1 2 3 10 Next ›