Standard Deviation-Based Quantization for Deep Neural Networks

Amir Ardakani; Arash Ardakani; Brett Meyer; James J. Clark; Warren J. Gross

Machine Learning 2022-02-28 v1 Artificial Intelligence

Abstract

Quantization of deep neural networks is a promising approach that reduces inference cost, making it feasible to run deep networks on resource-restricted devices. Inspired by existing methods, we propose a new framework that learns the quantization intervals (discrete values) using knowledge of the network's weight and activation distributions, i.e., their standard deviation. Furthermore, we propose a novel base-2 logarithmic quantization scheme that quantizes weights to power-of-two discrete values. This scheme allows us to replace resource-hungry high-precision multipliers with simple shift-add operations. According to our evaluations, our method outperforms existing work on the CIFAR10 and ImageNet datasets and, with 3-bit weights and activations, even achieves better accuracy than the full-precision models. Moreover, our scheme simultaneously prunes the network's parameters and allows us to flexibly adjust the pruning ratio during the quantization process.
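
To make the shift-add idea concrete, here is a minimal Python sketch of generic power-of-two quantization. It is not the paper's scheme: the function names, the exponent range (`min_exp`, `max_exp`), the rule that weights below the smallest representable magnitude are pruned to zero, and the use of non-negative integer activations are all assumptions chosen for illustration.

```python
import math

def quantize_pow2(w, min_exp=-4, max_exp=0):
    """Approximate w by sign * 2**exp, rounding in the log2 domain.

    Returns (sign, exp). A sign of 0 means the weight was pruned to zero.
    min_exp and max_exp are hypothetical bounds on the exponent.
    """
    if w == 0.0:
        return 0, 0
    exp = round(math.log2(abs(w)))
    if exp < min_exp:
        return 0, 0  # magnitude too small to represent: prune
    exp = min(exp, max_exp)  # clip large magnitudes to the top level
    sign = 1 if w > 0 else -1
    return sign, exp

def shift_multiply(x, sign, exp):
    """Multiply a non-negative integer x by sign * 2**exp without a multiplier."""
    if sign == 0:
        return 0
    shifted = x << exp if exp >= 0 else x >> -exp
    return shifted if sign > 0 else -shifted  # negation, not multiplication

def shift_add_dot(xs, ws):
    """Dot product using only shifts, negations, and additions."""
    return sum(shift_multiply(x, *quantize_pow2(w)) for x, w in zip(xs, ws))

# Integer activations paired with real-valued weights.
result = shift_add_dot([48, 48, 48], [0.23, -0.6, 0.01])
```

In this sketch 0.23 is quantized to 2^-2 and -0.6 to -2^-1, while 0.01 falls below the smallest level and is dropped, which is one simple way quantization and pruning can coincide.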

Keywords

quantization; neural network training; binary neural network

Cite

@article{arxiv.2202.12422,
  title  = {Standard Deviation-Based Quantization for Deep Neural Networks},
  author = {Amir Ardakani and Arash Ardakani and Brett Meyer and James J. Clark and Warren J. Gross},
  journal= {arXiv preprint arXiv:2202.12422},
  year   = {2022}
}
