Efficient Speech Representation Learning with Low-Bit Quantization

Ching-Feng Yeh; Wei-Ning Hsu; Paden Tomasello; Abdelrahman Mohamed

Audio and Speech Processing; Computation and Language. 2023-01-03 (v1)

Abstract

With the development of hardware for machine learning, newer models often come at the cost of both increased size and computational complexity. In an effort to improve the efficiency of these models, we apply and investigate recent quantization techniques on speech representation learning models. The quantization techniques were evaluated on the SUPERB benchmark. On the ASR task, aggressive quantization to 1 bit achieved an 86.32% storage reduction (184.42 -> 25.23) and an 88% estimated runtime reduction (1.00 -> 0.12), at the cost of an increased word error rate (7.06 -> 15.96). Compared with DistillHuBERT, which also aims for model compression, the 2-bit configuration yielded slightly smaller storage (35.84 vs. 46.98), a better word error rate (12.68 vs. 13.37), and a more efficient estimated runtime (0.15 vs. 0.73).
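As a quick sanity check on the figures quoted in the abstract, the relative reductions can be recomputed from the before/after values. The snippet below is only an illustrative sketch: the helper name `relative_reduction` is hypothetical, and the numbers are copied from the abstract as given, with no units assumed.

```python
def relative_reduction(baseline: float, compressed: float) -> float:
    """Percentage reduction when going from `baseline` to `compressed`."""
    return 100.0 * (baseline - compressed) / baseline


# 1-bit configuration vs. the unquantized baseline (values from the abstract).
storage_1bit = relative_reduction(184.42, 25.23)
runtime_1bit = relative_reduction(1.00, 0.12)
wer_increase_1bit = 15.96 - 7.06  # absolute change in word error rate

# 2-bit configuration vs. DistillHuBERT (values from the abstract).
storage_2bit_vs_distill = relative_reduction(46.98, 35.84)
runtime_2bit_vs_distill = relative_reduction(0.73, 0.15)

print(f"1-bit storage reduction: {storage_1bit:.2f}%")
print(f"1-bit runtime reduction: {runtime_1bit:.0f}%")
print(f"1-bit absolute WER increase: {wer_increase_1bit:.2f}")
print(f"2-bit storage vs. DistillHuBERT: {storage_2bit_vs_distill:.1f}% smaller")
print(f"2-bit runtime vs. DistillHuBERT: {runtime_2bit_vs_distill:.1f}% lower")
```

The first two values reproduce the 86.32% and 88% reductions stated in the abstract. The comparison against DistillHuBERT is derived here for convenience and is not a figure reported by the authors.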

Keywords

automatic speech recognition; self-supervised speech learning

Cite

@article{arxiv.2301.00652,
  title  = {Efficient Speech Representation Learning with Low-Bit Quantization},
  author = {Ching-Feng Yeh and Wei-Ning Hsu and Paden Tomasello and Abdelrahman Mohamed},
  journal= {arXiv preprint arXiv:2301.00652},
  year   = {2023}
}

Comments

7 pages
