Related papers: SplitQuant: Layer Splitting for Low-Bit Neural Net…

200 papers

Deep Neural Networks (DNNs) have many parameters and activation data, both of which are expensive to implement. One method to reduce the size of the DNN is to quantize the pre-trained model by using a low-bit expression for weights and…

Computer Vision and Pattern Recognition · Computer Science 2020-11-26 Jun Nishikawa, Ryoji Ikegaya

Quantization is a technique for creating efficient Deep Neural Networks (DNNs), which involves performing computations and storing tensors at lower bit-widths than f32 floating point precision. Quantization reduces model size and inference…

Machine Learning · Computer Science 2023-10-02 Eliska Kloberdanz, Wei Le

Quantization can improve the execution latency and energy efficiency of neural networks on both commodity GPUs and specialized accelerators. The majority of existing literature focuses on training quantized DNNs, while this work examines…

Machine Learning · Computer Science 2019-05-24 Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Christopher De Sa, Zhiru Zhang

The quantization of large language models (LLMs) is crucial for deploying them on devices with limited computational resources. While advanced quantization algorithms offer improved performance compared to the basic linear quantization,…

Machine Learning · Computer Science 2025-03-12 Jaewoo Song, Fangzhen Lin

Deep neural networks (DNNs) offer the highest performance in a wide range of applications in computer vision. These results rely on over-parameterized backbones, which are expensive to run. This computational burden can be dramatically…

Computer Vision and Pattern Recognition · Computer Science 2023-07-03 Edouard Yvinec, Arnaud Dapogny, Kevin Bailly

Low-bit deep neural networks (DNNs) become critical for embedded applications due to their low storage requirement and computing efficiency. However, they suffer much from the non-negligible accuracy drop. This paper proposes the stochastic…

Computer Vision and Pattern Recognition · Computer Science 2017-08-04 Yinpeng Dong, Renkun Ni, Jianguo Li, Yurong Chen, Jun Zhu, Hang Su

Quantization for deep neural networks (DNNs) has enabled developers to deploy models with less memory and more efficient low-power inference. However, not all DNN designs are friendly to quantization. For example, the popular Mobilenet…

Computer Vision and Pattern Recognition · Computer Science 2020-11-17 Thu Dinh, Andrey Melnikov, Vasilios Daskalopoulos, Sek Chai

The large computing and memory cost of deep neural networks (DNNs) often precludes their use in resource-constrained devices. Quantizing the parameters and operations to lower bit-precision offers substantial memory and energy savings for…

Machine Learning · Computer Science 2023-09-01 Clemens JS Schaefer, Siddharth Joshi, Shan Li, Raul Blazquez

Quantization is essential for reducing the computational cost and memory usage of deep neural networks, enabling efficient inference on low-precision hardware. Despite the growing adoption of uniform and floating-point quantization schemes,…

Machine Learning · Statistics 2026-05-19 Mehmet Aktukmak, Daniel Huang, Ke Ding

Recent advancements in machine learning achieved by Deep Neural Networks (DNNs) have been significant. While demonstrating high accuracy, DNNs are associated with a huge number of parameters and computations, which leads to high memory…

Machine Learning · Computer Science 2023-12-20 Babak Rokh, Ali Azarpeyvand, Alireza Khanteymoori

Deep Neural Networks (DNNs) have drawn attention because of their outstanding performance on various tasks. However, deploying full-fledged DNNs in resource-constrained devices (edge, mobile, IoT) is difficult due to their large size. To…

Machine Learning · Computer Science 2023-09-19 Divya J. Bajpai, Vivek K. Trivedi, Sohan L. Yadav, Manjesh K. Hanawal

In deep neural networks (DNNs), there are a huge number of weights and multiply-and-accumulate (MAC) operations. Accordingly, it is challenging to apply DNNs on resource-constrained platforms, e.g., mobile phones. Quantization is a method…

Machine Learning · Computer Science 2022-11-29 Wenhao Sun, Grace Li Zhang, Huaxi Gu, Bing Li, Ulf Schlichtmann

Convolutional Neural Networks (CNNs) have proven to be a powerful state-of-the-art method for image classification tasks. One drawback however is the high computational complexity and high memory consumption of CNNs which makes them…

Computer Vision and Pattern Recognition · Computer Science 2021-02-04 Rishabh Goyal, Joaquin Vanschoren, Victor van Acht, Stephan Nijssen

Quantized Neural Networks (QNNs), which use low bitwidth numbers for representing parameters and performing computations, have been proposed to reduce the computation complexity, storage size and memory usage. In QNNs, parameters and…

Computer Vision and Pattern Recognition · Computer Science 2017-06-23 Shuchang Zhou, Yuzhi Wang, He Wen, Qinyao He, Yuheng Zou

Quantization of large language models (LLMs) faces significant challenges, particularly due to the presence of outlier activations that impede efficient low-bit representation. Traditional approaches predominantly address Normal Outliers,…

Computation and Language · Computer Science 2024-11-04 Haokun Lin, Haobo Xu, Yichen Wu, Jingzhi Cui, Yingtao Zhang, Linzhan Mou, Linqi Song, Zhenan Sun, Ying Wei

Quantizing the weights of a neural network has two steps: (1) Finding a good low bit-complexity representation for weights (which we call the quantization grid) and (2) Rounding the original weights to values in the quantization grid. In…

Machine Learning · Computer Science 2025-01-14 Jerry Chee, Arturs Backurs, Rainie Heck, Li Zhang, Janardhan Kulkarni, Thomas Rothvoss, Sivakanth Gopi

Quantization of deep neural networks is a promising approach that reduces the inference cost, making it feasible to run deep networks on resource-restricted devices. Inspired by existing methods, we propose a new framework to learn the…

Machine Learning · Computer Science 2022-02-28 Amir Ardakani, Arash Ardakani, Brett Meyer, James J. Clark, Warren J. Gross

With pervasive applications of medical imaging in health-care, biomedical image segmentation plays a central role in quantitative analysis, clinical diagnosis, and medical intervention. Since manual annotation suffers limited…

Computer Vision and Pattern Recognition · Computer Science 2018-03-14 Xiaowei Xu, Qing Lu, Yu Hu, Lin Yang, Sharon Hu, Danny Chen, Yiyu Shi

Deep neural networks (DNNs) are ubiquitous in computer vision and natural language processing, but suffer from high inference cost. This problem can be addressed by quantization, which consists in converting floating point operations into a…

Computer Vision and Pattern Recognition · Computer Science 2023-11-28 Edouard Yvinec, Arnaud Dapogny, Kevin Bailly

Neural network training is a memory- and compute-intensive task. Quantization, which enables low-bitwidth formats in training, can significantly mitigate the workload. To reduce quantization error, recent methods have developed new data…

Machine Learning · Computer Science 2024-11-19 Wenjin Guo, Donglai Liu, Weiying Xie, Yunsong Li, Xuefei Ning, Zihan Meng, Shulin Zeng, Jie Lei, Zhenman Fang, Yu Wang