Related papers: UCorrect: An Unsupervised Framework for Automatic …

200 papers

Error correction techniques have been used to refine the output sentences from automatic speech recognition (ASR) models and achieve a lower word error rate (WER) than original ASR outputs. Previous works usually use a sequence-to-sequence…

Computation and Language · Computer Science 2022-11-30 Yichong Leng, Xu Tan, Linchen Zhu, Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiang-Yang Li, Ed Lin, Tie-Yan Liu

Error correction is widely used in automatic speech recognition (ASR) to post-process the generated sentence, and can further reduce the word error rate (WER). Although multiple candidates are generated by an ASR system through beam search,…

Computation and Language · Computer Science 2022-11-30 Yichong Leng, Xu Tan, Rui Wang, Linchen Zhu, Jin Xu, Wenjie Liu, Linquan Liu, Tao Qin, Xiang-Yang Li, Edward Lin, Tie-Yan Liu

Speech-to-text errors made by automatic speech recognition (ASR) systems negatively impact downstream models. Error correction models as a post-processing text editing method have been recently developed for refining the ASR outputs.…

Computation and Language · Computer Science 2023-06-22 Ziji Zhang, Zhehui Wang, Rajesh Kamma, Sharanya Eswaran, Narayanan Sadagopan

The transcription quality of automatic speech recognition (ASR) systems degrades significantly when transcribing audio from unseen domains. We propose an unsupervised error correction method for unsupervised ASR domain adaptation,…

Sound · Computer Science 2022-09-27 Long Mai, Julie Carson-Berndsen

Error correction in automatic speech recognition (ASR) aims to correct those incorrect words in sentences generated by ASR models. Since recent ASR models usually have low word error rate (WER), to avoid affecting originally correct tokens,…

Computation and Language · Computer Science 2023-12-21 Yichong Leng, Xu Tan, Wenjie Liu, Kaitao Song, Rui Wang, Xiang-Yang Li, Tao Qin, Edward Lin, Tie-Yan Liu

Transformer models have been used successfully in automatic speech recognition (ASR) and yield state-of-the-art results. However, their performance is still affected by speaker mismatch between training and test data. Further finetuning a…

Audio and Speech Processing · Electrical Eng. & Systems 2021-10-19 Yingzhu Zhao, Chongjia Ni, Cheung-Chi Leung, Shafiq Joty, Eng Siong Chng, Bin Ma

Post-editing in Automatic Speech Recognition (ASR) entails automatically correcting common and systematic errors produced by the ASR system. The outputs of an ASR system are largely prone to phonetic and spelling errors. In this paper, we…

Computation and Language · Computer Science 2022-08-24 Samrat Dutta, Shreyansh Jain, Ayush Maheshwari, Souvik Pal, Ganesh Ramakrishnan, Preethi Jyothi

Word error rate (WER) is a standard metric for the evaluation of Automated Speech Recognition (ASR) systems. However, WER fails to provide a fair evaluation of human perceived quality in presence of spelling variations, abbreviations, or…

Computation and Language · Computer Science 2023-03-10 Satarupa Guha, Rahul Ambavat, Ankur Gupta, Manish Gupta, Rupeshkumar Mehta

Accurately finding the wrong words in the automatic speech recognition (ASR) hypothesis and recovering them in a well-founded way is the goal of speech error correction. In this paper, we propose a non-autoregressive speech error correction method.…

Computation and Language · Computer Science 2024-07-19 Yuchun Shu, Bo Hu, Yifeng He, Hao Shi, Longbiao Wang, Jianwu Dang

Word error rate (WER) is a metric used to evaluate the quality of transcriptions produced by Automatic Speech Recognition (ASR) systems. In many applications, it is of interest to estimate WER given a pair of a speech utterance and a…

Computation and Language · Computer Science 2024-04-29 Chanho Park, Mingjie Chen, Thomas Hain

Automatic speech recognition (ASR) is a relevant area in multiple settings because it provides a natural communication mechanism between applications and users. ASRs often fail in environments that use language specific to particular…

Audio and Speech Processing · Electrical Eng. & Systems 2021-02-16 Rafael Viana-Cámara, Mario Campos-Soberanis, Diego Campos-Sobrino

Building an accurate automatic speech recognition (ASR) system requires a large dataset that contains many hours of labeled speech samples produced by a diverse set of speakers. The lack of such open free datasets is one of the main issues…

Computation and Language · Computer Science 2018-11-05 Jason Li, Ravi Gadde, Boris Ginsburg, Vitaly Lavrukhin

Error correction techniques remain effective to refine outputs from automatic speech recognition (ASR) models. Existing end-to-end error correction methods based on an encoder-decoder architecture process all tokens in the decoding phase,…

Computation and Language · Computer Science 2022-08-10 Jingyuan Yang, Rongjun Li, Wei Peng

The common standard for quality evaluation of automatic speech recognition (ASR) systems is reference-based metrics such as the Word Error Rate (WER), computed using manual ground-truth transcriptions that are time-consuming and expensive…

Computation and Language · Computer Science 2023-06-26 Kamer Ali Yuksel, Thiago Ferreira, Ahmet Gunduz, Mohamed Al-Badrashiny, Golara Javadi

Text encodings from automatic speech recognition (ASR) transcripts and audio representations have shown promise in speech emotion recognition (SER) ever since. Yet, it is challenging to explain the effect of each information stream on the…

In recent years, speaker diarization has attracted widespread attention. To achieve better performance, some studies propose to diarize speech in multiple stages. Although these methods might bring additional benefits, most of them are…

Audio and Speech Processing · Electrical Eng. & Systems 2023-09-19 Jiangyu Han, Yuhang Cao, Heng Lu, Yanhua Long

Attention-based sequence-to-sequence models for speech recognition jointly train an acoustic model, language model (LM), and alignment mechanism using a single neural network and require only parallel audio-text pairs. Thus, the language…

Audio and Speech Processing · Electrical Eng. & Systems 2019-02-20 Jinxi Guo, Tara N. Sainath, Ron J. Weiss

We consider the task of personalizing ASR models while being constrained by a fixed budget on recording speaker-specific utterances. Given a speaker and an ASR model, we propose a method of identifying sentences for which the speaker's…

Sound · Computer Science 2021-06-03 Abhijeet Awasthi, Aman Kansal, Sunita Sarawagi, Preethi Jyothi

The performance of automatic speech recognition (ASR) systems is usually evaluated by the word error rate (WER) metric when manually transcribed data are provided, which are, however, expensive to obtain in real scenarios. In…

Computation and Language · Computer Science 2020-09-01 Kai Fan, Jiayi Wang, Bo Li, Shiliang Zhang, Boxing Chen, Niyu Ge, Zhijie Yan

Generative Error Correction (GEC) has emerged as a powerful post-processing method to enhance the performance of Automatic Speech Recognition (ASR) systems. However, we show that GEC models struggle to generalize beyond the specific types…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-18 Sreyan Ghosh, Mohammad Sadegh Rasooli, Michael Levit, Peidong Wang, Jian Xue, Dinesh Manocha, Jinyu Li