English
Related papers

Related papers: DNQ: Dynamic Network Quantization

200 papers

With the development of deep neural networks, the size of network models becomes larger and larger. Model compression has become an urgent need for deploying these network models to mobile or embedded devices. Model quantization is a…

Machine Learning · Computer Science 2019-07-02 Wen-Pu Cai , Wu-Jun Li

Although weight and activation quantization is an effective approach for Deep Neural Network (DNN) compression and has a lot of potentials to increase inference speed leveraging bit-operations, there is still a noticeable gap in terms of…

Computer Vision and Pattern Recognition · Computer Science 2018-07-27 Dongqing Zhang , Jiaolong Yang , Dongqiangzi Ye , Gang Hua

Network quantization is an effective solution to compress deep neural networks for practical usage. Existing network quantization methods cannot sufficiently exploit the depth information to generate low-bit compressed network. In this…

Machine Learning · Computer Science 2018-12-18 Yuhui Xu , Yongzhuang Wang , Aojun Zhou , Weiyao Lin , Hongkai Xiong

Quantizing weights and activations of deep neural networks is essential for deploying them in resource-constrained devices, or cloud platforms for at-scale services. While binarization is a special case of quantization, this extreme case…

Computer Vision and Pattern Recognition · Computer Science 2021-04-02 Phuoc Pham , Jacob Abraham , Jaeyong Chung

Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently accelerate the inference and meanwhile reduce memory consumption of the deep neural networks, which is crucial for model deployment on…

Computer Vision and Pattern Recognition · Computer Science 2019-08-15 Ruihao Gong , Xianglong Liu , Shenghu Jiang , Tianxiang Li , Peng Hu , Jiazhen Lin , Fengwei Yu , Junjie Yan

In recent years Deep Neural Networks (DNNs) have been rapidly developed in various applications, together with increasingly complex architectures. The performance gain of these DNNs generally comes with high computational costs and large…

Machine Learning · Computer Science 2017-12-05 Yiren Zhou , Seyed-Mohsen Moosavi-Dezfooli , Ngai-Man Cheung , Pascal Frossard

Deep neural networks have achieved state-of-the art performance on various computer vision tasks. However, their deployment on resource-constrained devices has been hindered due to their high computational and storage complexity. While…

Computer Vision and Pattern Recognition · Computer Science 2020-07-21 Hassan Dbouk , Hetul Sanghvi , Mahesh Mehendale , Naresh Shanbhag

We address the problem of network quantization, that is, reducing bit-widths of weights and/or activations to lighten network architectures. Quantization methods use a rounding function to map full-precision values to the nearest quantized…

Computer Vision and Pattern Recognition · Computer Science 2021-08-17 Dohyung kim , Junghyup Lee , Bumsub Ham

In deep neural networks (DNNs), there are a huge number of weights and multiply-and-accumulate (MAC) operations. Accordingly, it is challenging to apply DNNs on resource-constrained platforms, e.g., mobile phones. Quantization is a method…

Machine Learning · Computer Science 2022-11-29 Wenhao Sun , Grace Li Zhang , Huaxi Gu , Bing Li , Ulf Schlichtmann

Quantization is a technique for creating efficient Deep Neural Networks (DNNs), which involves performing computations and storing tensors at lower bit-widths than f32 floating point precision. Quantization reduces model size and inference…

Machine Learning · Computer Science 2023-10-02 Eliska Kloberdanz , Wei Le

Neural network quantization is becoming an industry standard to efficiently deploy deep learning models on hardware platforms, such as CPU, GPU, TPU, and FPGAs. However, we observe that the conventional quantization approaches are…

Machine Learning · Computer Science 2019-04-19 Ji Lin , Chuang Gan , Song Han

Operating deep neural networks (DNNs) on devices with limited resources requires the reduction of their memory as well as computational footprint. Popular reduction methods are network quantization or pruning, which either reduce the word…

Deep neural networks (DNNs) can be made hardware-efficient by reducing the numerical precision of the weights and activations of the network and by improving the network's resilience to noise. However, this gain in efficiency often comes at…

Large-scale deep neural networks (DNNs) have achieved remarkable success in many application scenarios. However, high computational complexity and energy costs of modern DNNs make their deployment on edge devices challenging. Model…

Machine Learning · Computer Science 2024-04-29 Cédric Gernigon , Silviu-Ioan Filip , Olivier Sentieys , Clément Coggiola , Mickael Bruno

Efficient deep neural network (DNN) inference on mobile or embedded devices typically involves quantization of the network parameters and activations. In particular, mixed precision networks achieve better performance than networks with…

Model quantization is challenging due to many tedious hyper-parameters such as precision (bitwidth), dynamic range (minimum and maximum discrete values) and stepsize (interval between discrete values). Unlike prior arts that carefully tune…

Machine Learning · Computer Science 2021-07-08 Zhang Zhaoyang , Shao Wenqi , Gu Jinwei , Wang Xiaogang , Luo Ping

Quantizing deep networks with adaptive bit-widths is a promising technique for efficient inference across many devices and resource constraints. In contrast to static methods that repeat the quantization process and train different models…

Computer Vision and Pattern Recognition · Computer Science 2021-09-20 Ximeng Sun , Rameswar Panda , Chun-Fu Chen , Naigang Wang , Bowen Pan , Kailash Gopalakrishnan , Aude Oliva , Rogerio Feris , Kate Saenko

Low-bit quantization of network weights and activations can drastically reduce the memory footprint, complexity, energy consumption and latency of Deep Neural Networks (DNNs). However, low-bit quantization can also cause a considerable drop…

Computer Vision and Pattern Recognition · Computer Science 2021-03-25 Ghouthi Boukli Hacene , Lukas Mauch , Stefan Uhlich , Fabien Cardinaux

Deep Learning methods have been adopted in mobile networks, especially for network management automation where they provide means for advanced machine cognition. Deep learning methods utilize cutting-edge hardware and software tools,…

Machine Learning · Computer Science 2021-03-09 Marton Kajo , Stephen S. Mwanje , Benedek Schultz , Georg Carle

Neural network quantization methods often involve simulating the quantization process during training, making the trained model highly dependent on the target bit-width and precise way quantization is performed. Robust quantization offers…

Machine Learning · Computer Science 2020-10-23 Moran Shkolnik , Brian Chmiel , Ron Banner , Gil Shomron , Yury Nahshan , Alex Bronstein , Uri Weiser
‹ Prev 1 2 3 10 Next ›