Related papers: Gradient $\ell_1$ Regularization for Quantization …

200 papers

Quantization lowers memory usage, computational requirements, and latency by utilizing fewer bits to represent model weights and activations. In this work, we investigate the generalization properties of quantized neural networks, a…

Deep Neural Networks reached state-of-the-art performance across numerous domains, but this progress has come at the cost of increasingly large and over-parameterized models, posing serious challenges for deployment on resource-constrained…

Machine Learning · Computer Science 2026-02-04 Dario Malchiodi, Mattia Ferraretto, Marco Frasca

Neural networks are getting deeper and more computation-intensive nowadays. Quantization is a useful technique in deploying neural networks on hardware platforms and saving computation costs with negligible performance loss. However, recent…

Machine Learning · Computer Science 2021-01-26 Chang Song, Elias Fallon, Hai Li

While neural networks have been remarkably successful in a wide array of applications, implementing them in resource-constrained hardware remains an area of intense research. By replacing the weights of a neural network with quantized…

Machine Learning · Computer Science 2023-01-18 Jinjie Zhang, Yixuan Zhou, Rayan Saab

Neural network quantization enables the deployment of large models on resource-constrained devices. Current post-training quantization methods fall short in terms of accuracy for INT4 (or lower) but provide reasonable accuracy for INT8 (or…

Machine Learning · Computer Science 2020-03-17 Yury Nahshan, Brian Chmiel, Chaim Baskin, Evgenii Zheltonozhskii, Ron Banner, Alex M. Bronstein, Avi Mendelson

Despite the growing availability of high-capacity computational platforms, implementation complexity still has been a great concern for the real-world deployment of neural networks. This concern is not exclusively due to the huge costs of…

Machine Learning · Computer Science 2023-12-19 Felipe Dennis de Resende Oliveira, Eduardo Luiz Ortiz Batista, Rui Seara

Neural network quantization methods often involve simulating the quantization process during training, making the trained model highly dependent on the target bit-width and precise way quantization is performed. Robust quantization offers…

Machine Learning · Computer Science 2020-10-23 Moran Shkolnik, Brian Chmiel, Ron Banner, Gil Shomron, Yury Nahshan, Alex Bronstein, Uri Weiser

Reducing bit-widths of weights, activations, and gradients of a Neural Network can shrink its storage size and memory usage, and also allow for faster training and inference by exploiting bitwise operations. However, previous attempts for…

Machine Learning · Computer Science 2016-12-01 Qinyao He, He Wen, Shuchang Zhou, Yuxin Wu, Cong Yao, Xinyu Zhou, Yuheng Zou

Reducing bit-widths of activations and weights of deep networks makes it efficient to compute and store them in memory, which is crucial in their deployments to resource-limited devices, such as mobile phones. However, decreasing bit-widths…

Computer Vision and Pattern Recognition · Computer Science 2018-11-26 Sangil Jung, Changyong Son, Seohyung Lee, Jinwoo Son, Youngjun Kwak, Jae-Joon Han, Sung Ju Hwang, Changkyu Choi

Regularization is typically understood as improving generalization by altering the landscape of local extrema to which the model eventually converges. Deep neural networks (DNNs), however, challenge this view: We show that removing…

Machine Learning · Computer Science 2019-06-03 Aditya Golatkar, Alessandro Achille, Stefano Soatto

We challenge the prevailing view that weight oscillations observed during Quantization Aware Training (QAT) are merely undesirable side-effects and argue instead that they are an essential part of QAT. We show in a univariate linear model…

Machine Learning · Computer Science 2025-12-10 Jonathan Wenshøj, Bob Pepin, Raghavendra Selvan

Network quantization generally converts full-precision weights and/or activations into low-bit fixed-point values in order to accelerate an inference process. Recent approaches to network quantization further discretize the gradients into…

Computer Vision and Pattern Recognition · Computer Science 2024-07-18 Dohyung Kim, Junghyup Lee, Jeimin Jeon, Jaehyeon Moon, Bumsub Ham

In this paper we study the effects of quantization in DNN training. We hypothesize that weight quantization is a form of regularization and the amount of regularization is correlated with the quantization level (precision). We confirm our…

Current model quantization methods have shown their promising capability in reducing storage space and computation complexity. However, due to the diversity of quantization forms supported by different hardware, one limitation of existing…

Computer Vision and Pattern Recognition · Computer Science 2023-08-16 Ke Xu, Lei Han, Ye Tian, Shangshang Yang, Xingyi Zhang

Although deep neural networks are highly effective, their high computational and memory costs severely challenge their applications on portable devices. As a consequence, low-bit quantization, which converts a full-precision neural network…

Computer Vision and Pattern Recognition · Computer Science 2019-12-02 Jiwei Yang, Xu Shen, Jun Xing, Xinmei Tian, Houqiang Li, Bing Deng, Jianqiang Huang, Xiansheng Hua

Quantization, a commonly used technique to reduce the memory footprint of a neural network for edge computing, entails reducing the precision of the floating-point representation used for the parameters of the network. The impact of such…

Machine Learning · Computer Science 2019-03-27 Abhishek Murthy, Himel Das, Md Ariful Islam

Neural network quantization has become an important research area due to its great impact on deployment of large models on resource constrained devices. In order to train networks that can be effectively discretized without loss of…

Machine Learning · Computer Science 2018-10-05 Christos Louizos, Matthias Reisser, Tijmen Blankevoort, Efstratios Gavves, Max Welling

Robust quantization improves the tolerance of networks for various implementations, allowing reliable output in different bit-widths or fragmented low-precision arithmetic. In this work, we perform extensive analyses to identify the sources…

Machine Learning · Computer Science 2022-08-02 Sein Park, Yeongsang Jang, Eunhyeok Park

Quantization reduces computation costs of neural networks but suffers from performance degeneration. Is this accuracy drop due to the reduced capacity, or inefficient training during the quantization procedure? After looking into the…

Computer Vision and Pattern Recognition · Computer Science 2019-12-24 Qing Jin, Linjie Yang, Zhenyu Liao

Adversarial training is an effective methodology for training deep neural networks that are robust against adversarial, norm-bounded perturbations. However, the computational cost of adversarial training grows prohibitively as the size of…
