Related papers: Generating Human Readable Transcript for Automatic…

200 papers

Modern Automatic Speech Recognition (ASR) systems can achieve high performance in terms of recognition accuracy. However, a perfectly accurate transcript still can be challenging to read due to grammatical errors, disfluency, and other…

Computation and Language · Computer Science 2020-04-10 Junwei Liao, Sefik Emre Eskimez, Liyang Lu, Yu Shi, Ming Gong, Linjun Shou, Hong Qu, Michael Zeng

In this work, we introduce a simple yet efficient post-processing model for automatic speech recognition (ASR). Our model has Transformer-based encoder-decoder architecture which "translates" ASR model output into grammatically and…

Computation and Language · Computer Science 2019-10-24 Oleksii Hrinchuk, Mariya Popova, Boris Ginsburg

Although modern automatic speech recognition (ASR) systems can achieve high performance, they may produce errors that weaken readers' experience and do harm to downstream tasks. To improve the accuracy and reliability of ASR hypotheses, we…

Audio and Speech Processing · Electrical Eng. & Systems 2022-01-11 Jing Du, Shiliang Pu, Qinbo Dong, Chao Jin, Xin Qi, Dian Gu, Ru Wu, Hongwei Zhou

In a pipeline speech translation system, the automatic speech recognition (ASR) system will transmit recognition errors to the downstream machine translation (MT) system. A standard machine translation system is usually trained on parallel…

Computation and Language · Computer Science 2019-10-29 Qiao Cheng, Meiyuan Fang, Yaqian Han, Jin Huang, Yitao Duan

Speech enhancement (SE) systems are typically evaluated using a variety of instrumental metrics. The use of automatic speech recognition (ASR) systems to evaluate SE performance is common in literature, usually in terms of word error rate…

Audio and Speech Processing · Electrical Eng. & Systems 2026-05-13 Danilo de Oliveira, Tal Peer, Timo Gerkmann

Aiming at reducing the reliance on expensive human annotations, data synthesis for Automatic Speech Recognition (ASR) has remained an active area of research. While prior work mainly focuses on synthetic speech generation for ASR data…

With the surge of online meetings, it has become more critical than ever to provide high-quality speech audio and live captioning under various noise conditions. However, most monaural speech enhancement (SE) models introduce processing…

Audio and Speech Processing · Electrical Eng. & Systems 2021-06-08 Sefik Emre Eskimez, Xiaofei Wang, Min Tang, Hemin Yang, Zirun Zhu, Zhuo Chen, Huaming Wang, Takuya Yoshioka

For about four decades, human beings have been dreaming of an intelligent machine that can master natural speech. In its simplest form, this machine should consist of two subsystems, namely automatic speech recognition (ASR) and speech…

Sound · Computer Science 2013-05-08 Urmila Shrawankar, V. M. Thakare

Automatic Speech Recognition (ASR) systems are known to exhibit difficulties when transcribing children's speech. This can mainly be attributed to the absence of large children's speech corpora to train robust ASR models and the resulting…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-22 Jenthe Thienpondt, Kris Demuynck

Automatic speech recognition (ASR) systems are primarily evaluated on transcription accuracy. However, in some use cases such as subtitling, verbatim transcription would reduce output readability given limited screen size and reading time.…

Computation and Language · Computer Science 2020-05-26 Danni Liu, Jan Niehues, Gerasimos Spanakis

Recent years have witnessed significant improvement in ASR systems to recognize spoken utterances. However, it is still a challenging task for noisy and out-of-domain data, where substitution and deletion errors are prevalent in the…

Audio and Speech Processing · Electrical Eng. & Systems 2021-06-17 Mukuntha Narayanan Sundararaman, Ayush Kumar, Jithendra Vepa

Automatic speech recognition (ASR) outcomes serve as input for downstream tasks, substantially impacting the satisfaction level of end-users. Hence, the diagnosis and enhancement of the vulnerabilities present in the ASR model bear…

Computation and Language · Computer Science 2024-01-29 Seonmin Koo, Chanjun Park, Jinsung Kim, Jaehyung Seo, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim

As human-machine voice interfaces provide easy access to increasingly intelligent machines, many state-of-the-art automatic speech recognition (ASR) systems are proposed. However, commercial ASR systems usually have poor performance on…

Computation and Language · Computer Science 2023-09-28 Yanan Jia

Recent advances in supervised, semi-supervised and self-supervised deep learning algorithms have shown significant improvement in the performance of automatic speech recognition (ASR) systems. The state-of-the-art systems have achieved a…

Computation and Language · Computer Science 2021-10-19 Somnath Roy

Automatic speech recognition (ASR) systems often make unrecoverable errors due to subsystem pruning (acoustic, language and pronunciation models); for example pruning words due to acoustics using short-term context, prior to rescoring with…

Computation and Language · Computer Science 2019-07-01 Prashanth Gurunath Shivakumar, Haoqi Li, Kevin Knight, Panayiotis Georgiou

Speech-enabled systems typically first convert audio to text through an automatic speech recognition (ASR) model and then feed the text to downstream natural language processing (NLP) modules. The errors of the ASR system can seriously…

Computation and Language · Computer Science 2021-03-26 Tong Cui, Jinghui Xiao, Liangyou Li, Xin Jiang, Qun Liu

Automatic speech recognition (ASR) models rely on high-quality transcribed data for effective training. Generating pseudo-labels for large unlabeled audio datasets often relies on complex pipelines that combine multiple ASR outputs through…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-06 Jeena Prakash, Blessingh Kumar, Kadri Hacioglu, Bidisha Sharma, Sindhuja Gopalan, Malolan Chetlur, Shankar Venkatesan, Andreas Stolcke

Supervised training of speech recognition models requires access to transcribed audio data, which often is not possible due to confidentiality issues. Our approach to this problem is to generate synthetic audio from a text-only corpus using…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-01 Yanis Perrin, Gilles Boulianne

Automatic speech recognition (ASR) systems generate real-time transcriptions but often miss nuances that human interpreters capture. While ASR is useful in many contexts, interpreters, who already use ASR tools such as Dragon, add critical…

Sound · Computer Science 2025-10-15 Carlos Arriaga, Alejandro Pozo, Javier Conde, Alvaro Alonso

Recent advancement in deep learning encouraged developing large automatic speech recognition (ASR) models that achieve promising results while ignoring computational and memory constraints. However, deploying such models on low resource…

Computer Vision and Pattern Recognition · Computer Science 2025-05-29 Abdul Hannan, Alessio Brutti, Shah Nawaz, Mubashir Noman