Related papers: Generative error correction for code-switching spe…

Listen Again and Choose the Right Answer: A New Paradigm for Automatic Speech Recognition with Large Language Models

Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR), which aims to predict the ground-truth transcription from the decoded N-best hypotheses. Thanks to the…

Computation and Language · Computer Science 2024-05-17 Yuchen Hu, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng, Ruizhe Li

Aligning Speech to Languages to Enhance Code-switching Speech Recognition

Code-switching (CS) refers to the switching of languages within a speech signal and results in language confusion for automatic speech recognition (ASR). To address language confusion, we propose a language alignment loss (LAL) that aligns…

Audio and Speech Processing · Electrical Eng. & Systems 2025-11-04 Hexin Liu, Xiangyu Zhang, Haoyang Zhang, Leibny Paola Garcia, Andy W. H. Khong, Eng Siong Chng, Shinji Watanabe

Can Generative Large Language Models Perform ASR Error Correction?

ASR error correction is an interesting option for post processing speech recognition system outputs. These error correction models are usually trained in a supervised fashion using the decoding results of a target ASR system. This approach…

Computation and Language · Computer Science 2023-10-02 Rao Ma, Mengjie Qian, Potsawee Manakul, Mark Gales, Kate Knill

Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction

With the strong representational power of large language models (LLMs), generative error correction (GER) for automatic speech recognition (ASR) aims to provide semantic and phonetic refinements to address ASR errors. This work explores how…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-14 Yuka Ko, Sheng Li, Chao-Han Huck Yang, Tatsuya Kawahara

LLM-based Generative Error Correction for Rare Words with Synthetic Data and Phonetic Context

Generative error correction (GER) with large language models (LLMs) has emerged as an effective post-processing approach to improve automatic speech recognition (ASR) performance. However, it often struggles with rare or domain-specific…

Sound · Computer Science 2025-05-26 Natsuo Yamashita, Masaaki Yamamoto, Hiroaki Kokubo, Yohei Kawaguchi

Semi-supervised Learning for Code-Switching ASR with Large Language Model Filter

The code-switching (CS) phenomenon occurs when words or phrases from different languages are alternated in a single sentence. Due to data scarcity, building an effective CS Automatic Speech Recognition (ASR) system remains challenging. In this…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-23 Yu Xi, Wen Ding, Kai Yu, Junjie Lai

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Given recent advances in generative AI technology, a key question is how large language models (LLMs) can enhance acoustic modeling tasks using text decoding results from a frozen, pretrained automatic speech recognition (ASR) model. To…

Computation and Language · Computer Science 2025-12-02 Chao-Han Huck Yang, Taejin Park, Yuan Gong, Yuanchao Li, Zhehuai Chen, Yen-Ting Lin, Chen Chen, Yuchen Hu, Kunal Dhawan, Piotr Żelasko, Chao Zhang, Yun-Nung Chen, Yu Tsao, Jagadeesh Balam, Boris Ginsburg, Sabato Marco Siniscalchi, Eng Siong Chng, Peter Bell, Catherine Lai, Shinji Watanabe, Andreas Stolcke

MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition

Despite notable advancements in automatic speech recognition (ASR), performance tends to degrade when faced with adverse conditions. Generative error correction (GER) leverages the exceptional text comprehension capabilities of large…

Audio and Speech Processing · Electrical Eng. & Systems 2024-05-08 Bingshen Mu, Yangze Li, Qijie Shao, Kun Wei, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie

Code-switching Speech Recognition Under the Lens: Model- and Data-Centric Perspectives

Code-switching automatic speech recognition (CS-ASR) presents unique challenges due to language confusion introduced by spontaneous intra-sentence switching and accent bias that blurs the phonetic boundaries. Although the constituent…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-18 Hexin Liu, Haoyang Zhang, Qiquan Zhang, Xiangyu Zhang, Dongyuan Shi, Eng Siong Chng, Haizhou Li

Can Large Language Models Reliably Correct Errors in Low-Resource ASR? A Contamination-Aware Case Study on West Frisian

Automatic speech recognition (ASR) has improved substantially in recent years, yet performance remains limited for low-resource languages. Large language models (LLMs) have shown promise for improving ASR through generative error correction…

Computation and Language · Computer Science 2026-05-20 Yun Hao, Reihaneh Amooie, Wietse de Vries, Rik van Noord, Martijn Wieling

LLM-based Code-Switched Text Generation for Grammatical Error Correction

With the rise of globalisation, code-switching (CSW) has become a ubiquitous part of multilingual conversation, posing new challenges for natural language processing (NLP), especially in Grammatical Error Correction (GEC). This work…

Computation and Language · Computer Science 2024-10-15 Tom Potter, Zheng Yuan

Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition

We introduce a new cross-modal fusion technique designed for generative error correction in automatic speech recognition (ASR). Our methodology leverages both acoustic information and external linguistic representations to generate accurate…

Computation and Language · Computer Science 2025-12-02 Srijith Radhakrishnan, Chao-Han Huck Yang, Sumeer Ahmad Khan, Rohit Kumar, Narsis A. Kiani, David Gomez-Cabrero, Jesper N. Tegner

ASR Error Correction using Large Language Models

Error correction (EC) models play a crucial role in refining Automatic Speech Recognition (ASR) transcriptions, enhancing the readability and quality of transcriptions. Without requiring access to the underlying code or model weights, EC…

Computation and Language · Computer Science 2025-01-22 Rao Ma, Mengjie Qian, Mark Gales, Kate Knill

Revisiting ASR Error Correction with Specialized Models

Language models play a central role in automatic speech recognition (ASR), yet most methods rely on text-only models unaware of ASR error patterns. Recently, large language models (LLMs) have been applied to ASR correction, but introduce…

Machine Learning · Computer Science 2026-03-18 Zijin Gu, Tatiana Likhomanenko, He Bai, Erik McDermott, Ronan Collobert, Navdeep Jaitly

Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR), which leverages the rich linguistic knowledge and powerful reasoning ability of LLMs to improve…

Computation and Language · Computer Science 2024-01-22 Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Chao Zhang, Pin-Yu Chen, EnSiong Chng

Speech LLMs are Contextual Reasoning Transcribers

Despite extensions to speech inputs, effectively leveraging the rich knowledge and contextual understanding of large language models (LLMs) in automatic speech recognition (ASR) remains non-trivial, as the task primarily involves direct…

Computation and Language · Computer Science 2026-04-02 Keqi Deng, Ruchao Fan, Bo Ren, Yiming Wang, Jinyu Li

Textual Data Augmentation for Arabic-English Code-Switching Speech Recognition

The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that speech recognition (ASR) systems handle mixed language. Designing a CS-ASR system has many challenges, mainly due to data scarcity, grammatical…

Computation and Language · Computer Science 2023-01-12 Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, Najim Dehak, Ahmed Ali, Sanjeev Khudanpur

Language-agnostic Code-Switching in Sequence-To-Sequence Speech Recognition

Code-Switching (CS) refers to the phenomenon of alternately using words and phrases from different languages. While today's neural end-to-end (E2E) models deliver state-of-the-art performances on the task of automatic speech…

Computation and Language · Computer Science 2023-07-04 Enes Yavuz Ugan, Christian Huber, Juan Hussain, Alexander Waibel

Listening and Seeing Again: Generative Error Correction for Audio-Visual Speech Recognition

Unlike traditional Automatic Speech Recognition (ASR), Audio-Visual Speech Recognition (AVSR) takes audio and visual signals simultaneously to infer the transcription. Recent studies have shown that Large Language Models (LLMs) can be…

Multimedia · Computer Science 2025-01-09 Rui Liu, Hongyu Yuan, Haizhou Li

Evolutionary Prompt Design for LLM-Based Post-ASR Error Correction

Building upon the strength of modern large language models (LLMs), generative error correction (GEC) has emerged as a promising paradigm that can elevate the performance of modern automatic speech recognition (ASR) systems. One…

Computation and Language · Computer Science 2024-07-24 Rithik Sachdev, Zhong-Qiu Wang, Chao-Han Huck Yang