DNQ: Dynamic Network Quantization

Machine Learning 2018-12-07 v1 Computer Vision and Pattern Recognition

Abstract

Network quantization is an effective method for deploying neural networks on memory- and energy-constrained mobile devices. In this paper, we propose a Dynamic Network Quantization (DNQ) framework composed of two modules: a bit-width controller and a quantizer. Unlike most existing quantization methods, which use a single quantization bit-width for the whole network, the bit-width controller uses policy gradient to train an agent that learns the bit-width of each layer. This controller can trade off accuracy against compression ratio. Given the resulting bit-width sequence, the quantizer uses the quantization distance as the criterion for weight importance during quantization. We extensively validate the proposed approach on various mainstream neural networks and obtain impressive results.
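To make the two ingredients of the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a symmetric uniform quantizer, takes "quantization distance" to mean the absolute gap between a weight and its quantized value, and assumes that weights with the smallest distance are the ones selected first. All function names are hypothetical, and the policy-gradient controller is not modeled: the per-layer bit-widths are simply given as input.

```python
def quantize_uniform(weights, bits):
    """Symmetric uniform quantization of a list of floats (assumed scheme)."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    qmax = 2 ** (bits - 1) - 1              # largest integer level
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)                # nothing to scale
    scale = max_abs / qmax
    # Round to the nearest level, clamp to the representable range, rescale.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def quantization_distance(weights, bits):
    """Assumed definition: |w - q(w)| for each weight."""
    quantized = quantize_uniform(weights, bits)
    return [abs(w - q) for w, q in zip(weights, quantized)]


def select_closest(weights, bits, fraction):
    """Indices of the given fraction of weights with the smallest distance."""
    dist = quantization_distance(weights, bits)
    k = int(len(weights) * fraction)
    order = sorted(range(len(weights)), key=lambda i: dist[i])
    return sorted(order[:k])


def compression_ratio(layer_sizes, bit_widths, full_precision_bits=32):
    """Ratio of full-precision storage to storage at per-layer bit-widths."""
    original = sum(n * full_precision_bits for n in layer_sizes)
    compressed = sum(n * b for n, b in zip(layer_sizes, bit_widths))
    return original / compressed
```

For example, `compression_ratio([100, 100], [4, 8])` compares two equally sized layers quantized to 4 and 8 bits against 32-bit storage. A controller of the kind described above would search over such bit-width sequences, balancing this ratio against accuracy.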

Cite

@article{arxiv.1812.02375,
  title  = {DNQ: Dynamic Network Quantization},
  author = {Yuhui Xu and Shuai Zhang and Yingyong Qi and Jiaxian Guo and Weiyao Lin and Hongkai Xiong},
  journal= {arXiv preprint arXiv:1812.02375},
  year   = {2018}
}