Related papers: On Quantizing Implicit Neural Representations

Model compression as constrained optimization, with application to neural nets. Part II: quantization

We consider the problem of deep neural net compression by quantization: given a large, reference net, we want to quantize its real-valued weights using a codebook with $K$ entries so that the training loss of the quantized net is minimal.…

Machine Learning · Computer Science 2017-07-17 Miguel Á. Carreira-Perpiñán, Yerlan Idelbayev

Neural Image Compression with Quantization Rectifier

Neural image compression has been shown to outperform traditional image codecs in terms of rate-distortion performance. However, quantization introduces errors in the compression process, which can degrade the quality of the compressed…

Machine Learning · Computer Science 2024-03-27 Wei Luo, Bo Chen

QGen: On the Ability to Generalize in Quantization Aware Training

Quantization lowers memory usage, computational requirements, and latency by utilizing fewer bits to represent model weights and activations. In this work, we investigate the generalization properties of quantized neural networks, a…

Machine Learning · Computer Science 2024-04-22 MohammadHossein AskariHemmat, Ahmadreza Jeddi, Reyhane Askari Hemmat, Ivan Lazarevich, Alexander Hoffman, Sudhakar Sah, Ehsan Saboori, Yvon Savaria, Jean-Pierre David

Quantization of Deep Neural Networks for Accurate Edge Computing

Deep neural networks (DNNs) have demonstrated their great potential in recent years, exceeding the performance of human experts in a wide range of applications. Due to their large sizes, however, compression techniques such as weight…

Computer Vision and Pattern Recognition · Computer Science 2021-10-15 Wentao Chen, Hailong Qiu, Jian Zhuang, Chutong Zhang, Yu Hu, Qing Lu, Tianchen Wang, Yiyu Shi, Meiping Huang, Xiaowe Xu

Quantization-Aware Regularizers for Deep Neural Networks Compression

Deep Neural Networks reached state-of-the-art performance across numerous domains, but this progress has come at the cost of increasingly large and over-parameterized models, posing serious challenges for deployment on resource-constrained…

Machine Learning · Computer Science 2026-02-04 Dario Malchiodi, Mattia Ferraretto, Marco Frasca

Retraining-Based Iterative Weight Quantization for Deep Neural Networks

Model compression has gained a lot of attention due to its ability to reduce hardware resource requirements significantly while maintaining accuracy of DNNs. Model compression is especially useful for memory-intensive recurrent neural…

Machine Learning · Computer Science 2018-05-30 Dongsoo Lee, Byeongwook Kim

Alternating Multi-bit Quantization for Recurrent Neural Networks

Recurrent neural networks have achieved excellent performance in many applications. However, on portable devices with limited resources, the models are often too large to deploy. For applications on the server with large scale concurrent…

Machine Learning · Computer Science 2018-02-02 Chen Xu, Jianqiang Yao, Zhouchen Lin, Wenwu Ou, Yuanbin Cao, Zhirong Wang, Hongbin Zha

Coordinate Quantized Neural Implicit Representations for Multi-view Reconstruction

In recent years, huge progress has been made on learning neural implicit representations from multi-view images for 3D reconstruction. As an additional input complementing coordinates, using sinusoidal functions as positional encodings…

Computer Vision and Pattern Recognition · Computer Science 2023-08-23 Sijia Jiang, Jing Hua, Zhizhong Han

Quantization of Neural Network Equalizers in Optical Fiber Transmission Experiments

The quantization of neural networks for the mitigation of the nonlinear and components' distortions in dual-polarization optical fiber transmission is studied. Two low-complexity neural network equalizers are applied in three 16-QAM 34.4…

Signal Processing · Electrical Eng. & Systems 2023-10-11 Jamal Darweesh, Nelson Costa, Antonio Napoli, Bernhard Spinnler, Yves Jaouen, Mansoor Yousefi

Widening and Squeezing: Towards Accurate and Efficient QNNs

Quantization neural networks (QNNs) are very attractive to the industry because of their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters. Most of…

Computer Vision and Pattern Recognition · Computer Science 2020-02-13 Chuanjian Liu, Kai Han, Yunhe Wang, Hanting Chen, Qi Tian, Chunjing Xu

Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization

Quantization of neural networks has become common practice, driven by the need for efficient implementations of deep neural networks on embedded devices. In this paper, we exploit an oft-overlooked degree of freedom in most networks - for a…

Machine Learning · Computer Science 2019-02-07 Eldad Meller, Alexander Finkelstein, Uri Almog, Mark Grobman

And the Bit Goes Down: Revisiting the Quantization of Neural Networks

In this paper, we address the problem of reducing the memory footprint of convolutional network architectures. We introduce a vector quantization method that aims at preserving the quality of the reconstruction of the network outputs rather…

Computer Vision and Pattern Recognition · Computer Science 2020-11-10 Pierre Stock, Armand Joulin, Rémi Gribonval, Benjamin Graham, Hervé Jégou

Quantization Networks

Although deep neural networks are highly effective, their high computational and memory costs severely challenge their applications on portable devices. As a consequence, low-bit quantization, which converts a full-precision neural network…

Computer Vision and Pattern Recognition · Computer Science 2019-12-02 Jiwei Yang, Xu Shen, Jun Xing, Xinmei Tian, Houqiang Li, Bing Deng, Jianqiang Huang, Xiansheng Hua

Effective Quantization Methods for Recurrent Neural Networks

Reducing bit-widths of weights, activations, and gradients of a Neural Network can shrink its storage size and memory usage, and also allow for faster training and inference by exploiting bitwise operations. However, previous attempts for…

Machine Learning · Computer Science 2016-12-01 Qinyao He, He Wen, Shuchang Zhou, Yuxin Wu, Cong Yao, Xinyu Zhou, Yuheng Zou

A Survey on Methods and Theories of Quantized Neural Networks

Deep neural networks are the state-of-the-art methods for many real-world tasks, such as computer vision, natural language processing and speech recognition. For all its popularity, deep neural networks are also criticized for consuming a…

Machine Learning · Computer Science 2018-12-18 Yunhui Guo

Differentiable, Bit-shifting, and Scalable Quantization without training neural network from scratch

Quantization of neural networks provides the benefit of inference with lower compute and memory requirements. Previous work in quantization lacks two important aspects which this work provides. First, almost all previous work in quantization used a…

Computer Vision and Pattern Recognition · Computer Science 2025-12-12 Zia Badar

Bit Efficient Quantization for Deep Neural Networks

Quantization for deep neural networks has afforded models for edge devices that use less on-board memory and enable efficient low-power inference. In this paper, we present a comparison of model-parameter driven quantization approaches…

Computer Vision and Pattern Recognition · Computer Science 2019-10-14 Prateeth Nayak, David Zhang, Sek Chai

Neural Network Quantization for Efficient Inference: A Survey

As neural networks have become more powerful, there has been a rising desire to deploy them in the real world; however, the power and accuracy of neural networks is largely due to their depth and complexity, making them difficult to deploy,…

Machine Learning · Computer Science 2023-01-19 Olivia Weng

Attacking Binarized Neural Networks

Neural networks with low-precision weights and activations offer compelling efficiency advantages over their full-precision equivalents. The two most frequently discussed benefits of quantization are reduced memory consumption, and a faster…

Machine Learning · Computer Science 2018-02-01 Angus Galloway, Graham W. Taylor, Medhat Moussa

A Survey of Quantization Methods for Efficient Neural Network Inference

As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related…

Computer Vision and Pattern Recognition · Computer Science 2021-06-23 Amir Gholami, Sehoon Kim, Zhen Dong, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer