Related papers: Robust ASR Error Correction with Conservative Data…

200 papers

Error correction (EC) models play a crucial role in refining Automatic Speech Recognition (ASR) transcriptions, enhancing the readability and quality of transcriptions. Without requiring access to the underlying code or model weights, EC…

Computation and Language · Computer Science 2025-01-22 Rao Ma, Mengjie Qian, Mark Gales, Kate Knill

We previously proposed contextual spelling correction (CSC) to correct the output of end-to-end (E2E) automatic speech recognition (ASR) models with contextual information such as name, place, etc. Although CSC has achieved reasonable…

Sound · Computer Science 2023-02-23 Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Sheng Zhao

Code-switching (CS) speech refers to the phenomenon of mixing two or more languages within the same sentence. Despite the recent advances in automatic speech recognition (ASR), CS-ASR is still a challenging task owing to the grammatical…

Computation and Language · Computer Science 2023-10-23 Chen Chen, Yuchen Hu, Chao-Han Huck Yang, Hexin Liu, Sabato Marco Siniscalchi, Eng Siong Chng

Attention-based sequence-to-sequence models for speech recognition jointly train an acoustic model, language model (LM), and alignment mechanism using a single neural network and require only parallel audio-text pairs. Thus, the language…

Audio and Speech Processing · Electrical Eng. & Systems 2019-02-20 Jinxi Guo, Tara N. Sainath, Ron J. Weiss

The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that speech recognition (ASR) systems handle mixed language. Designing a CS-ASR system has many challenges, mainly due to data scarcity, grammatical…

Computation and Language · Computer Science 2023-01-12 Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, Najim Dehak, Ahmed Ali, Sanjeev Khudanpur

Automatic Speech Recognition (ASR) is a fundamental and important task in the field of speech and natural language processing. It is an inherent building block in many applications such as voice assistants, speech translation, etc. Despite…

Computation and Language · Computer Science 2024-12-05 Victor Junqiu Wei, Weicheng Wang, Di Jiang, Yuanfeng Song, Lu Wang

Modern ASR systems are typically trained on large-scale pseudo-labeled, in-the-wild data spanning multiple domains. While such heterogeneous data benefit generalist models designed for broad deployment, they pose challenges for specialist…

Aiming to improve the Automatic Speech Recognition (ASR) outputs with a post-processing step, ASR error correction (EC) techniques have been widely developed due to their efficiency in using parallel text data. Previous works mainly focus…

Audio and Speech Processing · Electrical Eng. & Systems 2023-05-29 Vanya Bannihatti Kumar, Shanbo Cheng, Ningxin Peng, Yuchen Zhang

Accurately finding the wrong words in the automatic speech recognition (ASR) hypothesis and recovering them reliably is the goal of speech error correction. In this paper, we propose a non-autoregressive speech error correction method.…

Computation and Language · Computer Science 2024-07-19 Yuchun Shu, Bo Hu, Yifeng He, Hao Shi, Longbiao Wang, Jianwu Dang

In this work, we present the first study addressing automatic speech recognition (ASR) for children in an online learning setting. This is particularly important for both child-centric applications and the privacy protection of minors,…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-07 Edem Ahadzi, Vishwanath Pratap Singh, Tomi Kinnunen, Ville Hautamaki

Contextual ASR or hotword customization holds substantial practical value. Despite the impressive performance of current end-to-end (E2E) automatic speech recognition (ASR) systems, they often face challenges in accurately recognizing rare…

Audio and Speech Processing · Electrical Eng. & Systems 2024-11-12 Guanrou Yang, Ziyang Ma, Zhifu Gao, Shiliang Zhang, Xie Chen

This paper presents a method to train end-to-end automatic speech recognition (ASR) models using unpaired data. Although the end-to-end approach can eliminate the need for expert knowledge such as pronunciation dictionaries to build ASR…

Computation and Language · Computer Science 2019-05-24 Takaaki Hori, Ramon Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux

Error correction techniques have been used to refine the output sentences from automatic speech recognition (ASR) models and achieve a lower word error rate (WER). Previous works usually adopt end-to-end models and have a strong dependency on…

Computation and Language · Computer Science 2024-01-12 Jiaxin Guo, Minghan Wang, Xiaosong Qiao, Daimeng Wei, Hengchao Shang, Zongyao Li, Zhengzhe Yu, Yinglu Li, Chang Su, Min Zhang, Shimin Tao, Hao Yang

Code-switching (CS) refers to the switching of languages within a speech signal and results in language confusion for automatic speech recognition (ASR). To address language confusion, we propose a language alignment loss (LAL) that aligns…

Audio and Speech Processing · Electrical Eng. & Systems 2025-11-04 Hexin Liu, Xiangyu Zhang, Haoyang Zhang, Leibny Paola Garcia, Andy W. H. Khong, Eng Siong Chng, Shinji Watanabe

This paper proposes an adaptation method for end-to-end speech recognition. In this method, multiple automatic speech recognition (ASR) 1-best hypotheses are integrated in the computation of the connectionist temporal classification (CTC)…

Computation and Language · Computer Science 2021-04-01 Cong-Thanh Do, Rama Doddipatla, Thomas Hain

Effective communication in Air Traffic Control (ATC) is critical to maintaining aviation safety, yet the challenges posed by accented English remain largely unaddressed in Automatic Speech Recognition (ASR) systems. Existing models struggle…

Training automatic speech recognition (ASR) systems requires large amounts of well-curated paired data. However, human annotators usually perform "non-verbatim" transcription, which can result in poorly trained models. In this paper, we…

Audio and Speech Processing · Electrical Eng. & Systems 2023-09-28 Dongji Gao, Hainan Xu, Desh Raj, Leibny Paola Garcia Perera, Daniel Povey, Sanjeev Khudanpur

We consider the problem of recognizing speech utterances spoken to a device which is generating a known sound waveform; for example, recognizing queries issued to a digital assistant which is generating responses to previous user inputs.…

Audio and Speech Processing · Electrical Eng. & Systems 2021-06-03 Nathan Howard, Alex Park, Turaj Zakizadeh Shabestary, Alexander Gruenstein, Rohit Prabhavalkar

We present an approach to reduce the performance disparity between geographic regions without degrading performance on the overall user population for ASR. A popular approach is to fine-tune the model with data from regions where the ASR…

Audio and Speech Processing · Electrical Eng. & Systems 2024-02-09 Viet Anh Trinh, Pegah Ghahremani, Brian King, Jasha Droppo, Andreas Stolcke, Roland Maas

Improving the representation of contextual information is key to unlocking the potential of end-to-end (E2E) automatic speech recognition (ASR). In this work, we present a novel and simple approach for training an ASR context mechanism with…

Audio and Speech Processing · Electrical Eng. & Systems 2018-10-30 Uri Alon, Golan Pundak, Tara N. Sainath