Efficient Speech Representation Learning with Low-Bit Quantization

Audio and Speech Processing 2023-01-03 v1 Computation and Language

Abstract

With the development of hardware for machine learning, newer models often come at the cost of both increased size and computational complexity. In an effort to improve the efficiency of these models, we apply and investigate recent quantization techniques on speech representation learning models. The quantization techniques were evaluated on the SUPERB benchmark. On the ASR task, with aggressive quantization to 1 bit, we achieved an 86.32% storage reduction (184.42 -> 25.23) and an 88% estimated runtime reduction (1.00 -> 0.12), at the cost of an increased word error rate (7.06 -> 15.96). Compared with DistillHuBERT, which also aims at model compression, the 2-bit configuration yielded slightly smaller storage (35.84 vs. 46.98), a better word error rate (12.68 vs. 13.37), and a more efficient estimated runtime (0.15 vs. 0.73).
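The reduction percentages quoted in the abstract follow from the paired baseline and quantized figures. A minimal sketch of that arithmetic is given here; the helper name `relative_reduction` is hypothetical, and the abstract does not state the unit of the storage figures, so the values are treated as unitless.

```python
def relative_reduction(baseline: float, compressed: float) -> float:
    """Return the percentage reduction going from baseline to compressed."""
    return (1.0 - compressed / baseline) * 100.0


# Figures taken from the abstract (1-bit configuration, ASR task).
storage_1bit = relative_reduction(184.42, 25.23)
runtime_1bit = relative_reduction(1.00, 0.12)

print(f"storage reduction: {storage_1bit:.2f}%")  # 86.32%
print(f"runtime reduction: {runtime_1bit:.0f}%")  # 88%
```

The same helper can be applied to the 2-bit figures (35.84 against the 184.42 baseline) to compare configurations on a common scale.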

Cite

@article{arxiv.2301.00652,
  title  = {Efficient Speech Representation Learning with Low-Bit Quantization},
  author = {Ching-Feng Yeh and Wei-Ning Hsu and Paden Tomasello and Abdelrahman Mohamed},
  journal= {arXiv preprint arXiv:2301.00652},
  year   = {2023}
}

Comments

7 pages
