Related papers: Efficient Speech Representation Learning with Low-…

Towards One-bit ASR: Extremely Low-bit Conformer Quantization Using Co-training and Stochastic Precision

Model compression has become an emerging need as the sizes of modern speech systems rapidly increase. In this paper, we study model weight quantization, which directly reduces the memory footprint to accommodate computationally…

Sound · Computer Science 2025-05-28 Zhaoqing Li, Haoning Xu, Zengrui Jin, Lingwei Meng, Tianzi Wang, Huimeng Wang, Youjun Chen, Mingyu Cui, Shujie Hu, Xunying Liu

2-bit Conformer quantization for automatic speech recognition

Large speech models are rapidly gaining traction in the research community. As a result, model compression has become an important topic, so that these models can fit in memory and be served at reduced cost. Practical approaches for…

Audio and Speech Processing · Electrical Eng. & Systems 2023-05-29 Oleg Rybakov, Phoenix Meadowlark, Shaojin Ding, David Qiu, Jian Li, David Rim, Yanzhang He

4-bit Quantization of LSTM-based Speech Recognition Models

We investigate the impact of aggressive low-precision representations of weights and activations in two families of large LSTM-based architectures for Automatic Speech Recognition (ASR): hybrid Deep Bidirectional LSTM - Hidden Markov Models…

Computation and Language · Computer Science 2021-08-30 Andrea Fasoli, Chia-Yu Chen, Mauricio Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan

4-bit Conformer with Native Quantization Aware Training for Speech Recognition

Reducing the latency and model size has always been a significant research problem for live Automatic Speech Recognition (ASR) application scenarios. Along this direction, model quantization has become an increasingly popular approach to…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-06 Shaojin Ding, Phoenix Meadowlark, Yanzhang He, Lukasz Lew, Shivani Agrawal, Oleg Rybakov

Edge-ASR: Towards Low-Bit Quantization of Automatic Speech Recognition Models

Recent advances in Automatic Speech Recognition (ASR) have demonstrated remarkable accuracy and robustness in diverse audio applications, such as live transcription and voice command processing. However, deploying these models on…

Sound · Computer Science 2025-08-05 Chen Feng, Yicheng Lin, Shaojie Zhuo, Chenzheng Su, Ramchalam Kinattinkara Ramakrishnan, Zhaocong Yuan, Xiaopeng Zhang

USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models

End-to-end automatic speech recognition (ASR) models have seen revolutionary quality gains with the recent development of large-scale universal speech models (USM). However, deploying these massive USMs is extremely expensive due to the…

Audio and Speech Processing · Electrical Eng. & Systems 2024-01-17 Shaojin Ding, David Qiu, David Rim, Yanzhang He, Oleg Rybakov, Bo Li, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Zhonglin Han, Jian Li, Amir Yazdanbakhsh, Shivani Agrawal

On the efficient representation and execution of deep acoustic models

In this paper we present a simple and computationally efficient quantization scheme that enables us to reduce the resolution of the parameters of a neural network from 32-bit floating point values to 8-bit integer values. The proposed…

Machine Learning · Computer Science 2016-12-20 Raziel Alvarez, Rohit Prabhavalkar, Anton Bakhtin

An Effective Training Framework for Light-Weight Automatic Speech Recognition Models

Recent advances in deep learning have encouraged the development of large automatic speech recognition (ASR) models that achieve promising results while ignoring computational and memory constraints. However, deploying such models on low resource…

Computer Vision and Pattern Recognition · Computer Science 2025-05-29 Abdul Hannan, Alessio Brutti, Shah Nawaz, Mubashir Noman

Quantizing Whisper-small: How design choices affect ASR performance

Large speech recognition models like Whisper-small achieve high accuracy but are difficult to deploy on edge devices due to their high computational demand. To this end, we present a unified, cross-library evaluation of post-training…

Audio and Speech Processing · Electrical Eng. & Systems 2026-05-22 Arthur Söhler, Julian Irigoyen, Andreas Søeborg Kirkedal

QuantSR+: Pushing the Limit of Quantized Image Super-Resolution Networks

Low-bit quantization is widely used to compress super-resolution (SR) models and reduce storage and computation costs for deployment on resource-limited devices. However, when SR models are pushed to ultra-low precision (2-4 bits),…

Computer Vision and Pattern Recognition · Computer Science 2026-05-22 Haotong Qin, Xudong Ma, Xianglong Liu, Jie Luo, Jinyang Guo, Michele Magno, Yulun Zhang

Sub-8-bit quantization for on-device speech recognition: a regularization-free approach

For on-device automatic speech recognition (ASR), quantization aware training (QAT) is ubiquitous to achieve the trade-off between model predictive performance and efficiency. Among existing QAT methods, one major drawback is that the…

Sound · Computer Science 2022-11-02 Kai Zhen, Martin Radfar, Hieu Duy Nguyen, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris

Enhancing Quantised End-to-End ASR Models via Personalisation

Recent end-to-end automatic speech recognition (ASR) models have become increasingly large, making them particularly challenging to deploy on resource-constrained devices. Model quantisation is an effective solution that sometimes…

Sound · Computer Science 2023-09-19 Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng

Effective and Efficient Mixed Precision Quantization of Speech Foundation Models

This paper presents a novel mixed-precision quantization approach for speech foundation models that tightly integrates mixed-precision learning and quantized model parameter estimation into one single model compression stage. Experiments…

Sound · Computer Science 2025-01-14 Haoning Xu, Zhaoqing Li, Zengrui Jin, Huimeng Wang, Youjun Chen, Guinan Li, Mengzhe Geng, Shujie Hu, Jiajun Deng, Xunying Liu

Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition

Recent years have witnessed great strides in self-supervised learning (SSL) for speech processing. The SSL model is normally pre-trained on a great variety of unlabelled data and a large model size is preferred to increase the modeling…

Audio and Speech Processing · Electrical Eng. & Systems 2025-05-08 Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang

Quantization for OpenAI's Whisper Models: A Comparative Analysis

Automated speech recognition (ASR) models have gained prominence for applications such as captioning, speech translation, and live transcription. This paper studies Whisper and two model variants: one optimized for live speech streaming and…

Sound · Computer Science 2025-03-14 Allison Andreyev

OneBit: Towards Extremely Low-bit Large Language Models

Model quantization uses low bit-width values to represent the weight matrices of existing models, which is a promising approach to reduce both the storage and computational overheads of deploying highly anticipated LLMs.…

Computation and Language · Computer Science 2024-12-02 Yuzhuang Xu, Xu Han, Zonghan Yang, Shuo Wang, Qingfu Zhu, Zhiyuan Liu, Weidong Liu, Wanxiang Che

Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus

State-of-the-art automatic speech recognition (ASR) systems are becoming increasingly complex and expensive for practical applications. This paper presents the development of a high performance and low-footprint 4-bit quantized LF-MMI…

Sound · Computer Science 2022-06-24 Junhao Xu, Shoukang Hu, Xunying Liu, Helen Meng

FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning

Large-scale speech self-supervised learning (SSL) has emerged as a main field of speech processing; however, the computational cost arising from its vast size creates a high entry barrier for academia. In addition, existing…

Audio and Speech Processing · Electrical Eng. & Systems 2022-07-04 Yeonghyeon Lee, Kangwook Jang, Jahyun Goo, Youngmoon Jung, Hoirin Kim

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

Self-supervised speech representation learning has shown promising results in various speech processing tasks. However, the pre-trained models, e.g., HuBERT, are storage-intensive Transformers, limiting their scope of applications under…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-22 Rui Wang, Qibing Bai, Junyi Ao, Long Zhou, Zhixiang Xiong, Zhihua Wei, Yu Zhang, Tom Ko, Haizhou Li

DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT

Self-supervised speech representation learning methods like wav2vec 2.0 and Hidden-unit BERT (HuBERT) leverage unlabeled speech data for pre-training and offer good representations for numerous speech processing tasks. Despite the success…

Computation and Language · Computer Science 2022-04-29 Heng-Jui Chang, Shu-wen Yang, Hung-yi Lee