iRNN: Integer-only Recurrent Neural Network

Eyyüb Sari; Vanessa Courville; Vahid Partovi Nia

Machine Learning, 2022-02-16, v2, Neural and Evolutionary Computing

Abstract

Recurrent neural networks (RNNs) are used in many real-world text and speech applications. They include complex modules such as recurrence, exponential-based activation, gate interaction, unfoldable normalization, bi-directional dependence, and attention. The interaction between these elements prevents running them with integer-only operations without a significant performance drop. Deploying RNNs that include layer normalization and attention on integer-only arithmetic is still an open problem. We present a quantization-aware training method for obtaining a highly accurate integer-only recurrent neural network (iRNN). Our approach supports layer normalization, attention, and an adaptive piecewise linear (PWL) approximation of activations, to serve a wide range of RNNs in various applications. The proposed method is proven to work on RNN-based language models and challenging automatic speech recognition, enabling AI applications on the edge. Our iRNN maintains performance similar to that of its full-precision counterpart; when deployed on smartphones, it improves runtime performance by $2\times$ and reduces the model size by $4\times$.
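As a rough illustration of two ideas named in the abstract, the sketch below approximates the sigmoid with a piecewise linear function and quantizes a few weights to 8-bit integers. It is not the authors' method: the knots here are uniformly spaced rather than adaptive, the arithmetic is floating point rather than integer-only, and the helper names (`build_pwl`, `eval_pwl`, `quantize_int8`) are hypothetical. Storing weights as 8-bit integers instead of 32-bit floats is one way a $4\times$ size reduction can arise (32 / 8 = 4).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def build_pwl(fn, lo, hi, pieces):
    """Interpolate fn on [lo, hi] with `pieces` uniform linear segments.

    Assumption: uniform knots, unlike the adaptive PWL in the abstract.
    Returns the knot positions and one (slope, intercept) pair per segment.
    """
    step = (hi - lo) / pieces
    knots = [lo + i * step for i in range(pieces + 1)]
    segments = []
    for a, b in zip(knots[:-1], knots[1:]):
        slope = (fn(b) - fn(a)) / (b - a)
        segments.append((slope, fn(a) - slope * a))
    return knots, segments


def eval_pwl(x, knots, segments):
    """Evaluate the PWL function, saturating outside [knots[0], knots[-1]]."""
    x = min(max(x, knots[0]), knots[-1])
    for i, (slope, intercept) in enumerate(segments):
        if x <= knots[i + 1]:
            return slope * x + intercept
    slope, intercept = segments[-1]
    return slope * x + intercept


def quantize_int8(values):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    peak = max(abs(v) for v in values)
    scale = peak / 127.0 if peak > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


knots, segments = build_pwl(sigmoid, -8.0, 8.0, 16)
grid = [i / 100.0 for i in range(-1000, 1001)]
max_err = max(abs(eval_pwl(x, knots, segments) - sigmoid(x)) for x in grid)

weights = [0.5, -1.0, 0.25, 0.8]
q_weights, scale = quantize_int8(weights)
dequantized = [q * scale for q in q_weights]
```

With 16 uniform segments on [-8, 8], `max_err` stays small because the sigmoid is nearly flat outside that range; using more segments, or placing knots where the curvature is highest, would tighten the fit further.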

Keywords

recurrent neural network, binary neural network, deep neural network

Cite

@article{arxiv.2109.09828,
  title  = {iRNN: Integer-only Recurrent Neural Network},
  author = {Eyyüb Sari and Vanessa Courville and Vahid Partovi Nia},
  journal= {arXiv preprint arXiv:2109.09828},
  year   = {2022}
}