Related papers: Generative error correction for code-switching spe…

Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR), which aims to predict the ground-truth transcription from the decoded N-best hypotheses. Thanks to the…

Computation and Language · Computer Science · 2024-05-17 · Yuchen Hu, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng, Ruizhe Li

Code-switching (CS) refers to the switching of languages within a speech signal and results in language confusion for automatic speech recognition (ASR). To address language confusion, we propose a language alignment loss (LAL) that aligns…

Audio and Speech Processing · Electrical Eng. & Systems · 2025-11-04 · Hexin Liu, Xiangyu Zhang, Haoyang Zhang, Leibny Paola Garcia, Andy W. H. Khong, Eng Siong Chng, Shinji Watanabe

ASR error correction is an interesting option for post processing speech recognition system outputs. These error correction models are usually trained in a supervised fashion using the decoding results of a target ASR system. This approach…

Computation and Language · Computer Science · 2023-10-02 · Rao Ma, Mengjie Qian, Potsawee Manakul, Mark Gales, Kate Knill

With the strong representational power of large language models (LLMs), generative error correction (GER) for automatic speech recognition (ASR) aims to provide semantic and phonetic refinements to address ASR errors. This work explores how…

Audio and Speech Processing · Electrical Eng. & Systems · 2024-10-14 · Yuka Ko, Sheng Li, Chao-Han Huck Yang, Tatsuya Kawahara

Generative error correction (GER) with large language models (LLMs) has emerged as an effective post-processing approach to improve automatic speech recognition (ASR) performance. However, it often struggles with rare or domain-specific…

Sound · Computer Science · 2025-05-26 · Natsuo Yamashita, Masaaki Yamamoto, Hiroaki Kokubo, Yohei Kawaguchi

The code-switching (CS) phenomenon occurs when words or phrases from different languages are alternated in a single sentence. Due to data scarcity, building an effective CS Automatic Speech Recognition (ASR) system remains challenging. In this…

Audio and Speech Processing · Electrical Eng. & Systems · 2024-09-23 · Yu Xi, Wen Ding, Kai Yu, Junjie Lai

Given recent advances in generative AI technology, a key question is how large language models (LLMs) can enhance acoustic modeling tasks using text decoding results from a frozen, pretrained automatic speech recognition (ASR) model. To…

Despite notable advancements in automatic speech recognition (ASR), performance tends to degrade when faced with adverse conditions. Generative error correction (GER) leverages the exceptional text comprehension capabilities of large…

Audio and Speech Processing · Electrical Eng. & Systems · 2024-05-08 · Bingshen Mu, Yangze Li, Qijie Shao, Kun Wei, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie

Code-switching automatic speech recognition (CS-ASR) presents unique challenges due to language confusion introduced by spontaneous intra-sentence switching and accent bias that blurs the phonetic boundaries. Although the constituent…

Audio and Speech Processing · Electrical Eng. & Systems · 2026-03-18 · Hexin Liu, Haoyang Zhang, Qiquan Zhang, Xiangyu Zhang, Dongyuan Shi, Eng Siong Chng, Haizhou Li

Automatic speech recognition (ASR) has improved substantially in recent years, yet performance remains limited for low-resource languages. Large language models (LLMs) have shown promise for improving ASR through generative error correction…

Computation and Language · Computer Science · 2026-05-20 · Yun Hao, Reihaneh Amooie, Wietse de Vries, Rik van Noord, Martijn Wieling

With the rise of globalisation, code-switching (CSW) has become a ubiquitous part of multilingual conversation, posing new challenges for natural language processing (NLP), especially in Grammatical Error Correction (GEC). This work…

Computation and Language · Computer Science · 2024-10-15 · Tom Potter, Zheng Yuan

We introduce a new cross-modal fusion technique designed for generative error correction in automatic speech recognition (ASR). Our methodology leverages both acoustic information and external linguistic representations to generate accurate…

Error correction (EC) models play a crucial role in refining Automatic Speech Recognition (ASR) transcriptions, enhancing the readability and quality of transcriptions. Without requiring access to the underlying code or model weights, EC…

Computation and Language · Computer Science · 2025-01-22 · Rao Ma, Mengjie Qian, Mark Gales, Kate Knill

Language models play a central role in automatic speech recognition (ASR), yet most methods rely on text-only models unaware of ASR error patterns. Recently, large language models (LLMs) have been applied to ASR correction, but introduce…

Machine Learning · Computer Science · 2026-03-18 · Zijin Gu, Tatiana Likhomanenko, He Bai, Erik McDermott, Ronan Collobert, Navdeep Jaitly

Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR), which leverages the rich linguistic knowledge and powerful reasoning ability of LLMs to improve…

Computation and Language · Computer Science · 2024-01-22 · Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Chao Zhang, Pin-Yu Chen, EnSiong Chng

Despite extensions to speech inputs, effectively leveraging the rich knowledge and contextual understanding of large language models (LLMs) in automatic speech recognition (ASR) remains non-trivial, as the task primarily involves direct…

Computation and Language · Computer Science · 2026-04-02 · Keqi Deng, Ruchao Fan, Bo Ren, Yiming Wang, Jinyu Li

The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that speech recognition (ASR) systems handle mixed language. Designing a CS-ASR system has many challenges, mainly due to data scarcity, grammatical…

Computation and Language · Computer Science · 2023-01-12 · Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, Najim Dehak, Ahmed Ali, Sanjeev Khudanpur

Code-switching (CS) refers to the phenomenon of alternately using words and phrases from different languages. While today's neural end-to-end (E2E) models deliver state-of-the-art performance on the task of automatic speech…

Computation and Language · Computer Science · 2023-07-04 · Enes Yavuz Ugan, Christian Huber, Juan Hussain, Alexander Waibel

Unlike traditional Automatic Speech Recognition (ASR), Audio-Visual Speech Recognition (AVSR) takes audio and visual signals simultaneously to infer the transcription. Recent studies have shown that Large Language Models (LLMs) can be…

Multimedia · Computer Science · 2025-01-09 · Rui Liu, Hongyu Yuan, Haizhou Li

Building upon the strength of modern large language models (LLMs), generative error correction (GEC) has emerged as a promising paradigm that can elevate the performance of modern automatic speech recognition (ASR) systems. One…

Computation and Language · Computer Science · 2024-07-24 · Rithik Sachdev, Zhong-Qiu Wang, Chao-Han Huck Yang