Related papers: UCorrect: An Unsupervised Framework for Automatic …

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

Error correction techniques have been used to refine the output sentences from automatic speech recognition (ASR) models and achieve a lower word error rate (WER) than original ASR outputs. Previous works usually use a sequence-to-sequence…

Computation and Language · Computer Science · 2022-11-30 · Yichong Leng, Xu Tan, Linchen Zhu, Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiang-Yang Li, Ed Lin, Tie-Yan Liu

FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition

Error correction is widely used in automatic speech recognition (ASR) to post-process the generated sentence, and can further reduce the word error rate (WER). Although multiple candidates are generated by an ASR system through beam search,…

Computation and Language · Computer Science · 2022-11-30 · Yichong Leng, Xu Tan, Rui Wang, Linchen Zhu, Jin Xu, Wenjie Liu, Linquan Liu, Tao Qin, Xiang-Yang Li, Edward Lin, Tie-Yan Liu

PATCorrect: Non-autoregressive Phoneme-augmented Transformer for ASR Error Correction

Speech-to-text errors made by automatic speech recognition (ASR) systems negatively impact downstream models. Error correction models as a post-processing text editing method have been recently developed for refining the ASR outputs.…

Computation and Language · Computer Science · 2023-06-22 · Ziji Zhang, Zhehui Wang, Rajesh Kamma, Sharanya Eswaran, Narayanan Sadagopan

Unsupervised domain adaptation for speech recognition with unsupervised error correction

The transcription quality of automatic speech recognition (ASR) systems degrades significantly when transcribing audios coming from unseen domains. We propose an unsupervised error correction method for unsupervised ASR domain adaption,…

Sound · Computer Science · 2022-09-27 · Long Mai, Julie Carson-Berndsen

SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition

Error correction in automatic speech recognition (ASR) aims to correct those incorrect words in sentences generated by ASR models. Since recent ASR models usually have low word error rate (WER), to avoid affecting originally correct tokens,…

Computation and Language · Computer Science · 2023-12-21 · Yichong Leng, Xu Tan, Wenjie Liu, Kaitao Song, Rui Wang, Xiang-Yang Li, Tao Qin, Edward Lin, Tie-Yan Liu

A Unified Speaker Adaptation Approach for ASR

Transformer models have been used in automatic speech recognition (ASR) successfully and yield state-of-the-art results. However, their performance is still affected by speaker mismatch between training and test data. Further finetuning a…

Audio and Speech Processing · Electrical Eng. & Systems · 2021-10-19 · Yingzhu Zhao, Chongjia Ni, Cheung-Chi Leung, Shafiq Joty, Eng Siong Chng, Bin Ma

Error Correction in ASR using Sequence-to-Sequence Models

Post-editing in Automatic Speech Recognition (ASR) entails automatically correcting common and systematic errors produced by the ASR system. The outputs of an ASR system are largely prone to phonetic and spelling errors. In this paper, we…

Computation and Language · Computer Science · 2022-08-24 · Samrat Dutta, Shreyansh Jain, Ayush Maheshwari, Souvik Pal, Ganesh Ramakrishnan, Preethi Jyothi

Unsupervised Language agnostic WER Standardization

Word error rate (WER) is a standard metric for the evaluation of Automated Speech Recognition (ASR) systems. However, WER fails to provide a fair evaluation of human perceived quality in presence of spelling variations, abbreviations, or…

Computation and Language · Computer Science · 2023-03-10 · Satarupa Guha, Rahul Ambavat, Ankur Gupta, Manish Gupta, Rupeshkumar Mehta

Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition

Accurately finding the wrong words in the automatic speech recognition (ASR) hypothesis and recovering them in a well-founded way is the goal of speech error correction. In this paper, we propose a non-autoregressive speech error correction method.…

Computation and Language · Computer Science · 2024-07-19 · Yuchun Shu, Bo Hu, Yifeng He, Hao Shi, Longbiao Wang, Jianwu Dang

Automatic Speech Recognition System-Independent Word Error Rate Estimation

Word error rate (WER) is a metric used to evaluate the quality of transcriptions produced by Automatic Speech Recognition (ASR) systems. In many applications, it is of interest to estimate WER given a pair of a speech utterance and a…

Computation and Language · Computer Science · 2024-04-29 · Chanho Park, Mingjie Chen, Thomas Hain

Hybrid phonetic-neural model for correction in speech recognition systems

Automatic speech recognition (ASR) is a relevant area in multiple settings because it provides a natural communication mechanism between applications and users. ASRs often fail in environments that use language specific to particular…

Audio and Speech Processing · Electrical Eng. & Systems · 2021-02-16 · Rafael Viana-Cámara, Mario Campos-Soberanis, Diego Campos-Sobrino

Training Neural Speech Recognition Systems with Synthetic Speech Augmentation

Building an accurate automatic speech recognition (ASR) system requires a large dataset that contains many hours of labeled speech samples produced by a diverse set of speakers. The lack of such open free datasets is one of the main issues…

Computation and Language · Computer Science · 2018-11-05 · Jason Li, Ravi Gadde, Boris Ginsburg, Vitaly Lavrukhin

ASR Error Correction with Constrained Decoding on Operation Prediction

Error correction techniques remain effective to refine outputs from automatic speech recognition (ASR) models. Existing end-to-end error correction methods based on an encoder-decoder architecture process all tokens in the decoding phase,…

Computation and Language · Computer Science · 2022-08-10 · Jingyuan Yang, Rongjun Li, Wei Peng

A Reference-less Quality Metric for Automatic Speech Recognition via Contrastive-Learning of a Multi-Language Model with Self-Supervision

The common standard for quality evaluation of automatic speech recognition (ASR) systems is reference-based metrics such as the Word Error Rate (WER), computed using manual ground-truth transcriptions that are time-consuming and expensive…

Computation and Language · Computer Science · 2023-06-26 · Kamer Ali Yuksel, Thiago Ferreira, Ahmet Gunduz, Mohamed Al-Badrashiny, Golara Javadi

On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era

Text encodings from automatic speech recognition (ASR) transcripts and audio representations have shown promise in speech emotion recognition (SER) ever since. Yet, it is challenging to explain the effect of each information stream on the…

Sound · Computer Science · 2021-04-21 · Shahin Amiriparian, Artem Sokolov, Ilhan Aslan, Lukas Christ, Maurice Gerczuk, Tobias Hübner, Dmitry Lamanov, Manuel Milling, Sandra Ottl, Ilya Poduremennykh, Evgeniy Shuranov, Björn W. Schuller

DiaCorrect: End-to-end error correction for speaker diarization

In recent years, speaker diarization has attracted widespread attention. To achieve better performance, some studies propose to diarize speech in multiple stages. Although these methods might bring additional benefits, most of them are…

Audio and Speech Processing · Electrical Eng. & Systems · 2023-09-19 · Jiangyu Han, Yuhang Cao, Heng Lu, Yanhua Long

A spelling correction model for end-to-end speech recognition

Attention-based sequence-to-sequence models for speech recognition jointly train an acoustic model, language model (LM), and alignment mechanism using a single neural network and require only parallel audio-text pairs. Thus, the language…

Audio and Speech Processing · Electrical Eng. & Systems · 2019-02-20 · Jinxi Guo, Tara N. Sainath, Ron J. Weiss

Error-driven Fixed-Budget ASR Personalization for Accented Speakers

We consider the task of personalizing ASR models while being constrained by a fixed budget on recording speaker-specific utterances. Given a speaker and an ASR model, we propose a method of identifying sentences for which the speaker's…

Sound · Computer Science · 2021-06-03 · Abhijeet Awasthi, Aman Kansal, Sunita Sarawagi, Preethi Jyothi

Neural Zero-Inflated Quality Estimation Model For Automatic Speech Recognition System

The performances of automatic speech recognition (ASR) systems are usually evaluated by the metric word error rate (WER) when the manually transcribed data are provided, which are, however, expensively available in the real scenario. In…

Computation and Language · Computer Science · 2020-09-01 · Kai Fan, Jiayi Wang, Bo Li, Shiliang Zhang, Boxing Chen, Niyu Ge, Zhijie Yan

Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation

Generative Error Correction (GEC) has emerged as a powerful post-processing method to enhance the performance of Automatic Speech Recognition (ASR) systems. However, we show that GEC models struggle to generalize beyond the specific types…

Audio and Speech Processing · Electrical Eng. & Systems · 2024-10-18 · Sreyan Ghosh, Mohammad Sadegh Rasooli, Michael Levit, Peidong Wang, Jian Xue, Dinesh Manocha, Jinyu Li