English
Related papers

Related papers: One-Bit Quantization for Random Features Models

200 papers

Model compression has gained a lot of attention due to its ability to reduce hardware resource requirements significantly while maintaining accuracy of DNNs. Model compression is especially useful for memory-intensive recurrent neural…

Machine Learning · Computer Science 2018-05-30 Dongsoo Lee , Byeongwook Kim

Compression is a key step to deploy large neural networks on resource-constrained platforms. As a popular compression technique, quantization constrains the number of distinct weight values and thus reducing the number of bits required to…

Machine Learning · Computer Science 2019-01-15 Yukun Ding , Jinglan Liu , Jinjun Xiong , Yiyu Shi

Weight quantization for deep ConvNets has shown promising results for applications such as image classification and semantic segmentation and is especially important for applications where memory storage is limited. However, when aiming for…

Machine Learning · Computer Science 2020-09-01 Ting-Wu Chin , Pierce I-Jen Chuang , Vikas Chandra , Diana Marculescu

Compressing large neural networks is an important step for their deployment in resource-constrained computational platforms. In this context, vector quantization is an appealing framework that expresses multiple parameters using a single…

Computer Vision and Pattern Recognition · Computer Science 2021-04-13 Julieta Martinez , Jashan Shewakramani , Ting Wei Liu , Ioan Andrei Bârsan , Wenyuan Zeng , Raquel Urtasun

Quantization lowers memory usage, computational requirements, and latency by utilizing fewer bits to represent model weights and activations. In this work, we investigate the generalization properties of quantized neural networks, a…

Quantization for deep neural networks have afforded models for edge devices that use less on-board memory and enable efficient low-power inference. In this paper, we present a comparison of model-parameter driven quantization approaches…

Computer Vision and Pattern Recognition · Computer Science 2019-10-14 Prateeth Nayak , David Zhang , Sek Chai

Efficient model inference is an important and practical issue in the deployment of deep neural network on resource constraint platforms. Network quantization addresses this problem effectively by leveraging low-bit representation and…

Computer Vision and Pattern Recognition · Computer Science 2020-01-01 Tianshu Chu , Qin Luo , Jie Yang , Xiaolin Huang

In this short note, we propose a new method for quantizing the weights of a fully trained neural network. A simple deterministic pre-processing step allows us to quantize network layers via memoryless scalar quantization while preserving…

Machine Learning · Computer Science 2023-04-06 Johannes Maly , Rayan Saab

Convolutional Neural Networks (CNNs) have proven to be a powerful state-of-the-art method for image classification tasks. One drawback however is the high computational complexity and high memory consumption of CNNs which makes them…

Computer Vision and Pattern Recognition · Computer Science 2021-02-04 Rishabh Goyal , Joaquin Vanschoren , Victor van Acht , Stephan Nijssen

Model quantification uses low bit-width values to represent the weight matrices of existing models to be quantized, which is a promising approach to reduce both storage and computational overheads of deploying highly anticipated LLMs.…

Computation and Language · Computer Science 2024-12-02 Yuzhuang Xu , Xu Han , Zonghan Yang , Shuo Wang , Qingfu Zhu , Zhiyuan Liu , Weidong Liu , Wanxiang Che

The role of quantization within implicit/coordinate neural networks is still not fully understood. We note that using a canonical fixed quantization scheme during training produces poor performance at low-rates due to the network weight…

Computer Vision and Pattern Recognition · Computer Science 2024-10-14 Cameron Gordon , Shin-Fang Chng , Lachlan MacDonald , Simon Lucey

We introduce a data-free quantization method for deep neural networks that does not require fine-tuning or hyperparameter selection. It achieves near-original model performance on common computer vision architectures and tasks. 8-bit…

Machine Learning · Computer Science 2019-11-26 Markus Nagel , Mart van Baalen , Tijmen Blankevoort , Max Welling

In recent years Deep Neural Networks (DNNs) have been rapidly developed in various applications, together with increasingly complex architectures. The performance gain of these DNNs generally comes with high computational costs and large…

Machine Learning · Computer Science 2017-12-05 Yiren Zhou , Seyed-Mohsen Moosavi-Dezfooli , Ngai-Man Cheung , Pascal Frossard

We introduce a quantization-aware training algorithm that guarantees avoiding numerical overflow when reducing the precision of accumulators during inference. We leverage weight normalization as a means of constraining parameters during…

Machine Learning · Computer Science 2023-02-01 Ian Colbert , Alessandro Pappalardo , Jakoba Petri-Koenig

While neural networks have been remarkably successful in a wide array of applications, implementing them in resource-constrained hardware remains an area of intense research. By replacing the weights of a neural network with quantized…

Machine Learning · Computer Science 2023-01-18 Jinjie Zhang , Yixuan Zhou , Rayan Saab

Quantization of neural networks has become common practice, driven by the need for efficient implementations of deep neural networks on embedded devices. In this paper, we exploit an oft-overlooked degree of freedom in most networks - for a…

Machine Learning · Computer Science 2019-02-07 Eldad Meller , Alexander Finkelstein , Uri Almog , Mark Grobman

Quantization of neural networks provides benefits of inference in less compute and memory requirements. Previous work in quantization lack two important aspects which this work provides. First almost all previous work in quantization used a…

Computer Vision and Pattern Recognition · Computer Science 2025-12-12 Zia Badar

Although deep neural networks are highly effective, their high computational and memory costs severely challenge their applications on portable devices. As a consequence, low-bit quantization, which converts a full-precision neural network…

Computer Vision and Pattern Recognition · Computer Science 2019-12-02 Jiwei Yang , Xu Shen , Jun Xing , Xinmei Tian , Houqiang Li , Bing Deng , Jianqiang Huang , Xiansheng Hua

Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators. However, the quantization functions used in most conventional quantization methods are non-differentiable, which increases the…

Computer Vision and Pattern Recognition · Computer Science 2020-09-21 Zhaohui Yang , Yunhe Wang , Kai Han , Chunjing Xu , Chao Xu , Dacheng Tao , Chang Xu

Quantizing weights and activations of deep neural networks results in significant improvement in inference efficiency at the cost of lower accuracy. A source of the accuracy gap between full precision and quantized models is the…

Machine Learning · Computer Science 2020-06-16 Hadi Pouransari , Zhucheng Tu , Oncel Tuzel
‹ Prev 1 2 3 10 Next ›