Related papers: SinReQ: Generalized Sinusoidal Regularization for …

200 papers

As deep neural networks make their way into different domains, their compute efficiency is becoming a first-order constraint. Deep quantization, which reduces the bitwidth of the operations (below 8 bits), offers a unique opportunity as it…

Deep Neural Networks (DNNs) typically require a massive amount of computational resources for inference tasks in computer vision applications. Quantization can significantly reduce DNN computation and storage by decreasing the bitwidth of…

Inference for state-of-the-art deep neural networks is computationally expensive, making them difficult to deploy on constrained hardware environments. An efficient way to reduce this complexity is to quantize the weight parameters and/or…

Computer Vision and Pattern Recognition · Computer Science 2018-07-03 Julian Faraone, Nicholas Fraser, Michaela Blott, Philip H. W. Leong

Post-training quantization has emerged as the most widely used strategy for deploying large language models at low precision. Still, current methods show perplexity degradation at bit-widths less than or equal to 4, partly because…

Machine Learning · Computer Science 2026-01-30 Lorenz K. Müller, Philippe Bich, Jiawei Zhuang, Ahmet Çelik, Luca Benfenati, Lukas Cavigelli

Mixed-precision quantization has been widely applied on deep neural networks (DNNs) as it leads to significantly better efficiency-accuracy tradeoffs compared to uniform quantization. Meanwhile, determining the exact precision of each layer…

Computer Vision and Pattern Recognition · Computer Science 2023-03-01 Lirui Xiao, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang

Quantizing weights and activations of deep neural networks is essential for deploying them in resource-constrained devices, or cloud platforms for at-scale services. While binarization is a special case of quantization, this extreme case…

Computer Vision and Pattern Recognition · Computer Science 2021-04-02 Phuoc Pham, Jacob Abraham, Jaeyong Chung

Robust quantization improves the tolerance of networks for various implementations, allowing reliable output in different bit-widths or fragmented low-precision arithmetic. In this work, we perform extensive analyses to identify the sources…

Machine Learning · Computer Science 2022-08-02 Sein Park, Yeongsang Jang, Eunhyeok Park

Low-bit deep neural networks (DNNs) have become critical for embedded applications due to their low storage requirements and computing efficiency. However, they suffer from a non-negligible accuracy drop. This paper proposes the stochastic…

Computer Vision and Pattern Recognition · Computer Science 2017-08-04 Yinpeng Dong, Renkun Ni, Jianguo Li, Yurong Chen, Jun Zhu, Hang Su

Deep neural networks (DNNs) can be made hardware-efficient by reducing the numerical precision of the weights and activations of the network and by improving the network's resilience to noise. However, this gain in efficiency often comes at…

This paper proposes a training method having multiple cyclic training for achieving enhanced performance in low-bit quantized convolutional neural networks (CNNs). Quantization is a popular method for obtaining lightweight CNNs, where the…

Computer Vision and Pattern Recognition · Computer Science 2022-06-28 HyunJin Kim, Jungwoo Shin, Alberto A. Del Barrio

Low-bit quantization of network weights and activations can drastically reduce the memory footprint, complexity, energy consumption and latency of Deep Neural Networks (DNNs). However, low-bit quantization can also cause a considerable drop…

Computer Vision and Pattern Recognition · Computer Science 2021-03-25 Ghouthi Boukli Hacene, Lukas Mauch, Stefan Uhlich, Fabien Cardinaux

Efficient inference is critical for deploying deep learning models on edge AI devices. Low-bit quantization (e.g., 3- and 4-bit) with fixed-point arithmetic improves efficiency, while low-power memory technologies like analog nonvolatile…

Machine Learning · Computer Science 2025-07-15 Anmol Biswas, Raghav Singhal, Sivakumar Elangovan, Shreyas Sabnis, Udayan Ganguly

Reducing bit-widths of activations and weights of deep networks makes it efficient to compute and store them in memory, which is crucial in their deployments to resource-limited devices, such as mobile phones. However, decreasing bit-widths…

Computer Vision and Pattern Recognition · Computer Science 2018-11-26 Sangil Jung, Changyong Son, Seohyung Lee, Jinwoo Son, Youngjun Kwak, Jae-Joon Han, Sung Ju Hwang, Changkyu Choi

Multi-bit quantization networks enable flexible deployment of deep neural networks by supporting multiple precision levels within a single model. However, existing approaches suffer from significant training overhead as full-dataset updates…

Computer Vision and Pattern Recognition · Computer Science 2025-10-24 Jinhee Kim, Jae Jun An, Kang Eun Jeon, Jong Hwan Ko

Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently accelerate the inference and meanwhile reduce memory consumption of the deep neural networks, which is crucial for model deployment on…

Computer Vision and Pattern Recognition · Computer Science 2019-08-15 Ruihao Gong, Xianglong Liu, Shenghu Jiang, Tianxiang Li, Peng Hu, Jiazhen Lin, Fengwei Yu, Junjie Yan

This paper presents incremental network quantization (INQ), a novel method that aims to efficiently convert any pre-trained full-precision convolutional neural network (CNN) model into a low-precision version whose weights are constrained…

Computer Vision and Pattern Recognition · Computer Science 2017-08-28 Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, Yurong Chen

Network quantization, which aims to reduce the bit-lengths of the network weights and activations, has emerged as one of the key ingredients to reduce the size of neural networks for their deployments to resource-limited devices. In order…

Computer Vision and Pattern Recognition · Computer Science 2021-09-08 Jung Hyun Lee, Jihun Yun, Sung Ju Hwang, Eunho Yang

Deep neural networks (DNNs) have achieved great success in a wide range of computer vision areas, but their application to mobile devices is limited due to their high storage and computational cost. Much effort has been devoted to compress…

Computer Vision and Pattern Recognition · Computer Science 2019-05-14 Yiming Hu, Jianquan Li, Xianlei Long, Shenhua Hu, Jiagang Zhu, Xingang Wang, Qingyi Gu

This paper tackles the problem of training a deep convolutional neural network of both low-bitwidth weights and activations. Optimizing a low-precision network is very challenging due to the non-differentiability of the quantizer, which may…

Computer Vision and Pattern Recognition · Computer Science 2021-06-07 Bohan Zhuang, Jing Liu, Mingkui Tan, Lingqiao Liu, Ian Reid, Chunhua Shen

Designing a deep neural network (DNN) with good generalization capability is a complex process especially when the weights are severely quantized. Model averaging is a promising approach for achieving the good generalization capability of…

Machine Learning · Computer Science 2020-02-04 Sungho Shin, Yoonho Boo, Wonyong Sung