Related papers: Prioritizing Speech Test Cases

200 papers

Speaker-attributed automatic speech recognition (SA-ASR) improves the accuracy and applicability of multi-speaker ASR systems in real-world scenarios by assigning speaker labels to transcribed texts. However, SA-ASR poses unique challenges…

Audio and Speech Processing · Electrical Eng. & Systems 2023-09-29 Xiang Lyu, Yuhang Cao, Qing Wang, Jingjing Yin, Yuguang Yang, Pengpeng Zou, Yanni Hu, Heng Lu

We consider the task of personalizing ASR models while being constrained by a fixed budget on recording speaker-specific utterances. Given a speaker and an ASR model, we propose a method of identifying sentences for which the speaker's…

Sound · Computer Science 2021-06-03 Abhijeet Awasthi, Aman Kansal, Sunita Sarawagi, Preethi Jyothi

Developers need to perform adequate testing to ensure the quality of Automatic Speech Recognition (ASR) systems. However, manually collecting required test cases is tedious and time-consuming. Our recent work proposes CrossASR, a…

Software Engineering · Computer Science 2022-01-06 Muhammad Hilmi Asyrofi, Zhou Yang, David Lo

Recently, self-supervised pre-training has gained success in automatic speech recognition (ASR). However, considering the difference between speech accents in real scenarios, how to identify accents and use accent features to improve ASR is…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-16 Keqi Deng, Songjun Cao, Long Ma

This paper describes our RoyalFlush system for the track of multi-speaker automatic speech recognition (ASR) in the M2MeT challenge. We adopted the serialized output training (SOT) based multi-speakers ASR system with large-scale simulation…

Sound · Computer Science 2022-02-25 Shuaishuai Ye, Peiyao Wang, Shunfei Chen, Xinhui Hu, Xinkang Xu

Identifying mistakes (i.e., miscues) made while reading aloud is commonly approached post-hoc by comparing automatic speech recognition (ASR) transcriptions to the target reading text. However, post-hoc methods perform poorly when ASR…

Machine Learning · Computer Science 2025-05-30 Griffin Dietz Smith, Dianna Yee, Jennifer King Chen, Leah Findlater

Recent years have witnessed significant improvement in ASR systems to recognize spoken utterances. However, it is still a challenging task for noisy and out-of-domain data, where substitution and deletion errors are prevalent in the…

Audio and Speech Processing · Electrical Eng. & Systems 2021-06-17 Mukuntha Narayanan Sundararaman, Ayush Kumar, Jithendra Vepa

With the development of deep learning, automatic speech recognition (ASR) has made significant progress. To further enhance the performance of ASR, revising recognition results is one of the lightweight but efficient manners. Various…

Computation and Language · Computer Science 2024-06-14 Yi-Wei Wang, Ke-Han Lu, Kuan-Yu Chen

Streaming Automatic Speech Recognition (ASR) in voice assistants can utilize prefetching to partially hide the latency of response generation. Prefetching involves passing a preliminary ASR hypothesis to downstream systems in order to…

Computation and Language · Computer Science 2023-05-24 Andreas Schwarz, Di He, Maarten Van Segbroeck, Mohammed Hethnawi, Ariya Rastrow

Modeling the errors of a speech recognizer can help simulate errorful recognized speech data from plain text, which has proven useful for tasks like discriminative language modeling, improving robustness of NLP systems, where limited or…

Artificial Intelligence · Computer Science 2024-08-22 Prashant Serai, Peidong Wang, Eric Fosler-Lussier

Automatic speech recognition (ASR) has the potential to substantially reduce manual annotation effort in child speech research by generating automatic transcriptions. However, obtaining reliably high-quality ASR transcriptions for child…

Computation and Language · Computer Science 2026-05-29 Gus Lathouwers, Lingyun Gao, Catia Cucchiarini, Helmer Strik

At the present time, computers are employed to solve complex tasks and problems ranging from simple calculations to intensive digital image processing and intricate algorithmic optimization problems to computationally-demanding weather…

Computation and Language · Computer Science 2012-03-26 Youssef Bassil, Paul Semaan

This paper presents a speech intelligibility model based on automatic speech recognition (ASR), combining phoneme probabilities from deep neural networks (DNN) and a performance measure that estimates the word error rate from these…

Automatic speech recognition (ASR) outcomes serve as input for downstream tasks, substantially impacting the satisfaction level of end-users. Hence, the diagnosis and enhancement of the vulnerabilities present in the ASR model bear…

Computation and Language · Computer Science 2024-01-29 Seonmin Koo, Chanjun Park, Jinsung Kim, Jaehyung Seo, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim

Recently, masked prediction pre-training has seen remarkable progress in self-supervised learning (SSL) for speech recognition. It usually requires a codebook obtained in an unsupervised way, making it less accurate and difficult to…

Computation and Language · Computer Science 2022-06-22 Chengyi Wang, Yiming Wang, Yu Wu, Sanyuan Chen, Jinyu Li, Shujie Liu, Furu Wei

Automatic classification of disordered speech can provide an objective tool for identifying the presence and severity of speech impairment. Classification approaches can also help identify hard-to-recognize speech samples to teach ASR…

Audio and Speech Processing · Electrical Eng. & Systems 2021-07-09 Subhashini Venugopalan, Joel Shor, Manoj Plakal, Jimmy Tobin, Katrin Tomanek, Jordan R. Green, Michael P. Brenner

Streaming automatic speech recognition (ASR) aims to emit each hypothesized word as quickly and accurately as possible. However, emitting fast without degrading quality, as measured by word error rate (WER), is highly challenging. Existing…

Audio and Speech Processing · Electrical Eng. & Systems 2021-02-05 Jiahui Yu, Chung-Cheng Chiu, Bo Li, Shuo-yiin Chang, Tara N. Sainath, Yanzhang He, Arun Narayanan, Wei Han, Anmol Gulati, Yonghui Wu, Ruoming Pang

Automatic speech recognition (ASR) systems often make unrecoverable errors due to subsystem pruning (acoustic, language and pronunciation models); for example pruning words due to acoustics using short-term context, prior to rescoring with…

Computation and Language · Computer Science 2019-07-01 Prashanth Gurunath Shivakumar, Haoqi Li, Kevin Knight, Panayiotis Georgiou

Speech-enabled systems typically first convert audio to text through an automatic speech recognition (ASR) model and then feed the text to downstream natural language processing (NLP) modules. The errors of the ASR system can seriously…

Computation and Language · Computer Science 2021-03-26 Tong Cui, Jinghui Xiao, Liangyou Li, Xin Jiang, Qun Liu

Automatic speech recognition (ASR) for African languages remains constrained by limited labeled data and the lack of systematic guidance on model selection, data scaling, and decoding strategies. Large pre-trained systems such as Whisper,…
