Related papers: Convolutional Neural Network Quantization using Ge…

Balanced Quantization: An Effective and Efficient Approach to Quantized Neural Networks

Quantized Neural Networks (QNNs), which use low bitwidth numbers for representing parameters and performing computations, have been proposed to reduce the computation complexity, storage size and memory usage. In QNNs, parameters and…

Computer Vision and Pattern Recognition · Computer Science 2017-06-23 Shuchang Zhou , Yuzhi Wang , He Wen , Qinyao He , Yuheng Zou

Fixed-point Quantization of Convolutional Neural Networks for Quantized Inference on Embedded Platforms

Convolutional Neural Networks (CNNs) have proven to be a powerful state-of-the-art method for image classification tasks. One drawback however is the high computational complexity and high memory consumption of CNNs which makes them…

Computer Vision and Pattern Recognition · Computer Science 2021-02-04 Rishabh Goyal , Joaquin Vanschoren , Victor van Acht , Stephan Nijssen

Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks

Quantized neural networks typically require smaller memory footprints and lower computation complexity, which is crucial for efficient deployment. However, quantization inevitably leads to a distribution divergence from the original…

Computer Vision and Pattern Recognition · Computer Science 2022-05-30 Runpei Dong , Zhanhong Tan , Mengdi Wu , Linfeng Zhang , Kaisheng Ma

Quantized Convolutional Neural Networks Through the Lens of Partial Differential Equations

Quantization of Convolutional Neural Networks (CNNs) is a common approach to ease the computational burden involved in the deployment of CNNs, especially on low-resource edge devices. However, fixed-point arithmetic is not natural to the…

Machine Learning · Computer Science 2024-06-14 Ido Ben-Yair , Gil Ben Shalom , Moshe Eliasof , Eran Treister

Exploring Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators

Energy efficiency and memory footprint of a convolutional neural network (CNN) implemented on a CNN inference accelerator depend on many factors, including a weight quantization strategy (i.e., data types and bit-widths) and mapping (i.e.,…

Hardware Architecture · Computer Science 2025-07-23 Jan Klhufek , Miroslav Safar , Vojtech Mrazek , Zdenek Vasicek , Lukas Sekanina

FQ-Conv: Fully Quantized Convolution for Efficient and Accurate Inference

Deep neural networks (DNNs) can be made hardware-efficient by reducing the numerical precision of the weights and activations of the network and by improving the network's resilience to noise. However, this gain in efficiency often comes at…

Machine Learning · Computer Science 2019-12-20 Bram-Ernst Verhoef , Nathan Laubeuf , Stefan Cosemans , Peter Debacker , Ioannis Papistas , Arindam Mallik , Diederik Verkest

Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation

With pervasive applications of medical imaging in health-care, biomedical image segmentation plays a central role in quantitative analysis, clinical diagno- sis, and medical intervention. Since manual anno- tation su ers limited…

Computer Vision and Pattern Recognition · Computer Science 2018-03-14 Xiaowei Xu , Qing Lu , Yu Hu , Lin Yang , Sharon Hu , Danny Chen , Yiyu Shi

Quantized Guided Pruning for Efficient Hardware Implementations of Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are state-of-the-art in numerous computer vision tasks such as object classification and detection. However, the large amount of parameters they contain leads to a high computational complexity and…

Machine Learning · Computer Science 2019-01-01 Ghouthi Boukli Hacene , Vincent Gripon , Matthieu Arzel , Nicolas Farrugia , Yoshua Bengio

Quantization of Deep Neural Networks for Accurate Edge Computing

Deep neural networks (DNNs) have demonstrated their great potential in recent years, exceeding the per-formance of human experts in a wide range of applications. Due to their large sizes, however, compressiontechniques such as weight…

Computer Vision and Pattern Recognition · Computer Science 2021-10-15 Wentao Chen , Hailong Qiu , Jian Zhuang , Chutong Zhang , Yu Hu , Qing Lu , Tianchen Wang , Yiyu Shi , Meiping Huang , Xiaowe Xu

Efficient Error-Tolerant Quantized Neural Network Accelerators

Neural Networks are currently one of the most widely deployed machine learning algorithms. In particular, Convolutional Neural Networks (CNNs), are gaining popularity and are evaluated for deployment in safety critical applications such as…

Signal Processing · Electrical Eng. & Systems 2019-12-17 Giulio Gambardella , Johannes Kappauf , Michaela Blott , Christoph Doehring , Martin Kumm , Peter Zipf , Kees Vissers

Quantized Convolutional Neural Networks for Mobile Devices

Recently, convolutional neural networks (CNN) have demonstrated impressive performance in various computer vision tasks. However, high performance hardware is typically indispensable for the application of CNN models due to the high…

Computer Vision and Pattern Recognition · Computer Science 2016-05-17 Jiaxiang Wu , Cong Leng , Yuhang Wang , Qinghao Hu , Jian Cheng

Adaptive Quantization for Deep Neural Network

In recent years Deep Neural Networks (DNNs) have been rapidly developed in various applications, together with increasingly complex architectures. The performance gain of these DNNs generally comes with high computational costs and large…

Machine Learning · Computer Science 2017-12-05 Yiren Zhou , Seyed-Mohsen Moosavi-Dezfooli , Ngai-Man Cheung , Pascal Frossard

Haar Wavelet Feature Compression for Quantized Graph Convolutional Networks

Graph Convolutional Networks (GCNs) are widely used in a variety of applications, and can be seen as an unstructured version of standard Convolutional Neural Networks (CNNs). As in CNNs, the computational cost of GCNs for large input graphs…

Computer Vision and Pattern Recognition · Computer Science 2021-10-12 Moshe Eliasof , Benjamin Bodner , Eran Treister

Edge Inference with Fully Differentiable Quantized Mixed Precision Neural Networks

The large computing and memory cost of deep neural networks (DNNs) often precludes their use in resource-constrained devices. Quantizing the parameters and operations to lower bit-precision offers substantial memory and energy savings for…

Machine Learning · Computer Science 2023-09-01 Clemens JS Schaefer , Siddharth Joshi , Shan Li , Raul Blazquez

Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware

Convolutional Neural Networks (CNN) has become more popular choice for various tasks such as computer vision, speech recognition and natural language processing. Thanks to their large computational capability and throughput, GPUs ,which are…

Machine Learning · Computer Science 2018-11-28 Natan Liss , Chaim Baskin , Avi Mendelson , Alex M. Bronstein , Raja Giryes

VWA: Hardware Efficient Vectorwise Accelerator for Convolutional Neural Network

Hardware accelerators for convolution neural networks (CNNs) enable real-time applications of artificial intelligence technology. However, most of the existing designs suffer from low hardware utilization or high area cost due to complex…

Hardware Architecture · Computer Science 2022-05-06 Kuo-Wei Chang , Tian-Sheuan Chang

Quantization of Deep Neural Networks for Accumulator-constrained Processors

We introduce an Artificial Neural Network (ANN) quantization methodology for platforms without wide accumulation registers. This enables fixed-point model deployment on embedded compute platforms that are not specifically designed for large…

Computer Vision and Pattern Recognition · Computer Science 2020-04-27 Barry de Bruin , Zoran Zivkovic , Henk Corporaal

A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification

Recent advancements in machine learning achieved by Deep Neural Networks (DNNs) have been significant. While demonstrating high accuracy, DNNs are associated with a huge number of parameters and computations, which leads to high memory…

Machine Learning · Computer Science 2023-12-20 Babak Rokh , Ali Azarpeyvand , Alireza Khanteymoori

Efficient Mixed Precision Quantization in Graph Neural Networks

Graph Neural Networks (GNNs) have become essential for handling large-scale graph applications. However, the computational demands of GNNs necessitate the development of efficient methods to accelerate inference. Mixed precision…

Machine Learning · Computer Science 2025-05-15 Samir Moustafa , Nils M. Kriege , Wilfried N. Gansterer

Quality Scalable Quantization Methodology for Deep Learning on Edge

Deep Learning Architectures employ heavy computations and bulk of the computational energy is taken up by the convolution operations in the Convolutional Neural Networks. The objective of our proposed work is to reduce the energy…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-17 Salman Abdul Khaliq , Rehan Hafiz