Related papers: Gradient $\ell_1$ Regularization for Quantization …

QGen: On the Ability to Generalize in Quantization Aware Training

Quantization lowers memory usage, computational requirements, and latency by utilizing fewer bits to represent model weights and activations. In this work, we investigate the generalization properties of quantized neural networks, a…

Machine Learning · Computer Science 2024-04-22 MohammadHossein AskariHemmat, Ahmadreza Jeddi, Reyhane Askari Hemmat, Ivan Lazarevich, Alexander Hoffman, Sudhakar Sah, Ehsan Saboori, Yvon Savaria, Jean-Pierre David

Quantization-Aware Regularizers for Deep Neural Networks Compression

Deep Neural Networks reached state-of-the-art performance across numerous domains, but this progress has come at the cost of increasingly large and over-parameterized models, posing serious challenges for deployment on resource-constrained…

Machine Learning · Computer Science 2026-02-04 Dario Malchiodi, Mattia Ferraretto, Marco Frasca

Improving Adversarial Robustness in Weight-quantized Neural Networks

Neural networks are getting deeper and more computation-intensive nowadays. Quantization is a useful technique in deploying neural networks on hardware platforms and saving computation costs with negligible performance loss. However, recent…

Machine Learning · Computer Science 2021-01-26 Chang Song, Elias Fallon, Hai Li

Post-training Quantization for Neural Networks with Provable Guarantees

While neural networks have been remarkably successful in a wide array of applications, implementing them in resource-constrained hardware remains an area of intense research. By replacing the weights of a neural network with quantized…

Machine Learning · Computer Science 2023-01-18 Jinjie Zhang, Yixuan Zhou, Rayan Saab

Loss Aware Post-training Quantization

Neural network quantization enables the deployment of large models on resource-constrained devices. Current post-training quantization methods fall short in terms of accuracy for INT4 (or lower) but provide reasonable accuracy for INT8 (or…

Machine Learning · Computer Science 2020-03-17 Yury Nahshan, Brian Chmiel, Chaim Baskin, Evgenii Zheltonozhskii, Ron Banner, Alex M. Bronstein, Avi Mendelson

On the Compression of Neural Networks Using $\ell_0$-Norm Regularization and Weight Pruning

Despite the growing availability of high-capacity computational platforms, implementation complexity still has been a great concern for the real-world deployment of neural networks. This concern is not exclusively due to the huge costs of…

Machine Learning · Computer Science 2023-12-19 Felipe Dennis de Resende Oliveira, Eduardo Luiz Ortiz Batista, Rui Seara

Robust Quantization: One Model to Rule Them All

Neural network quantization methods often involve simulating the quantization process during training, making the trained model highly dependent on the target bit-width and precise way quantization is performed. Robust quantization offers…

Machine Learning · Computer Science 2020-10-23 Moran Shkolnik, Brian Chmiel, Ron Banner, Gil Shomron, Yury Nahshan, Alex Bronstein, Uri Weiser

Effective Quantization Methods for Recurrent Neural Networks

Reducing bit-widths of weights, activations, and gradients of a Neural Network can shrink its storage size and memory usage, and also allow for faster training and inference by exploiting bitwise operations. However, previous attempts for…

Machine Learning · Computer Science 2016-12-01 Qinyao He, He Wen, Shuchang Zhou, Yuxin Wu, Cong Yao, Xinyu Zhou, Yuheng Zou

Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss

Reducing bit-widths of activations and weights of deep networks makes it efficient to compute and store them in memory, which is crucial in their deployments to resource-limited devices, such as mobile phones. However, decreasing bit-widths…

Computer Vision and Pattern Recognition · Computer Science 2018-11-26 Sangil Jung, Changyong Son, Seohyung Lee, Jinwoo Son, Youngjun Kwak, Jae-Joon Han, Sung Ju Hwang, Changkyu Choi

Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence

Regularization is typically understood as improving generalization by altering the landscape of local extrema to which the model eventually converges. Deep neural networks (DNNs), however, challenge this view: We show that removing…

Machine Learning · Computer Science 2019-06-03 Aditya Golatkar, Alessandro Achille, Stefano Soatto

Oscillations Make Neural Networks Robust to Quantization

We challenge the prevailing view that weight oscillations observed during Quantization Aware Training (QAT) are merely undesirable side-effects and argue instead that they are an essential part of QAT. We show in a univariate linear model…

Machine Learning · Computer Science 2025-12-10 Jonathan Wenshøj, Bob Pepin, Raghavendra Selvan

Toward INT4 Fixed-Point Training via Exploring Quantization Error for Gradients

Network quantization generally converts full-precision weights and/or activations into low-bit fixed-point values in order to accelerate an inference process. Recent approaches to network quantization further discretize the gradients into…

Computer Vision and Pattern Recognition · Computer Science 2024-07-18 Dohyung Kim, Junghyup Lee, Jeimin Jeon, Jaehyeon Moon, Bumsub Ham

QReg: On Regularization Effects of Quantization

In this paper we study the effects of quantization in DNN training. We hypothesize that weight quantization is a form of regularization and the amount of regularization is correlated with the quantization level (precision). We confirm our…

Computer Vision and Pattern Recognition · Computer Science 2022-08-18 MohammadHossein AskariHemmat, Reyhane Askari Hemmat, Alex Hoffman, Ivan Lazarevich, Ehsan Saboori, Olivier Mastropietro, Sudhakar Sah, Yvon Savaria, Jean-Pierre David

EQ-Net: Elastic Quantization Neural Networks

Current model quantization methods have shown their promising capability in reducing storage space and computation complexity. However, due to the diversity of quantization forms supported by different hardware, one limitation of existing…

Computer Vision and Pattern Recognition · Computer Science 2023-08-16 Ke Xu, Lei Han, Ye Tian, Shangshang Yang, Xingyi Zhang

Quantization Networks

Although deep neural networks are highly effective, their high computational and memory costs severely challenge their applications on portable devices. As a consequence, low-bit quantization, which converts a full-precision neural network…

Computer Vision and Pattern Recognition · Computer Science 2019-12-02 Jiwei Yang, Xu Shen, Jun Xing, Xinmei Tian, Houqiang Li, Bing Deng, Jianqiang Huang, Xiansheng Hua

Robustness of Neural Networks to Parameter Quantization

Quantization, a commonly used technique to reduce the memory footprint of a neural network for edge computing, entails reducing the precision of the floating-point representation used for the parameters of the network. The impact of such…

Machine Learning · Computer Science 2019-03-27 Abhishek Murthy, Himel Das, Md Ariful Islam

Relaxed Quantization for Discretized Neural Networks

Neural network quantization has become an important research area due to its great impact on deployment of large models on resource constrained devices. In order to train networks that can be effectively discretized without loss of…

Machine Learning · Computer Science 2018-10-05 Christos Louizos, Matthias Reisser, Tijmen Blankevoort, Efstratios Gavves, Max Welling

Symmetry Regularization and Saturating Nonlinearity for Robust Quantization

Robust quantization improves the tolerance of networks for various implementations, allowing reliable output in different bit-widths or fragmented low-precision arithmetic. In this work, we perform extensive analyses to identify the sources…

Machine Learning · Computer Science 2022-08-02 Sein Park, Yeongsang Jang, Eunhyeok Park

Towards Efficient Training for Neural Network Quantization

Quantization reduces computation costs of neural networks but suffers from performance degeneration. Is this accuracy drop due to the reduced capacity, or inefficient training during the quantization procedure? After looking into the…

Computer Vision and Pattern Recognition · Computer Science 2019-12-24 Qing Jin, Linjie Yang, Zhenyu Liao

Adversarial Robustness through Local Linearization

Adversarial training is an effective methodology for training deep neural networks that are robust against adversarial, norm-bounded perturbations. However, the computational cost of adversarial training grows prohibitively as the size of…

Machine Learning · Statistics 2019-10-11 Chongli Qin, James Martens, Sven Gowal, Dilip Krishnan, Krishnamurthy Dvijotham, Alhussein Fawzi, Soham De, Robert Stanforth, Pushmeet Kohli