Related papers: ASR Error Correction using Large Language Models

Robust ASR Error Correction with Conservative Data Filtering

Error correction (EC) based on large language models is an emerging technology to enhance the performance of automatic speech recognition (ASR) systems. Generally, training data for EC are collected by automatically pairing a large set of…

Computation and Language · Computer Science 2024-10-17 Takuma Udagawa, Masayuki Suzuki, Masayasu Muraoka, Gakuto Kurata

Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study

This paper explores the integration of Large Language Models (LLMs) into Automatic Speech Recognition (ASR) systems to improve transcription accuracy. The increasing sophistication of LLMs, with their in-context learning capabilities and…

Computation and Language · Computer Science 2025-06-03 Zeping Min, Jinbo Wang

Large Language Models based ASR Error Correction for Child Conversations

Automatic Speech Recognition (ASR) has recently shown remarkable progress, but accurately transcribing children's speech remains a significant challenge. Recent developments in Large Language Models (LLMs) have shown promise in improving…

Computation and Language · Computer Science 2025-05-27 Anfeng Xu, Tiantian Feng, So Hyun Kim, Somer Bishop, Catherine Lord, Shrikanth Narayanan

Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition

We propose to utilize an instruction-tuned large language model (LLM) for guiding the text generation process in automatic speech recognition (ASR). Modern large language models (LLMs) are adept at performing various text generation tasks…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-08 Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi

Can Generative Large Language Models Perform ASR Error Correction?

ASR error correction is an interesting option for post processing speech recognition system outputs. These error correction models are usually trained in a supervised fashion using the decoding results of a target ASR system. This approach…

Computation and Language · Computer Science 2023-10-02 Rao Ma, Mengjie Qian, Potsawee Manakul, Mark Gales, Kate Knill

ASR-EC Benchmark: Evaluating Large Language Models on Chinese ASR Error Correction

Automatic Speech Recognition (ASR) is a fundamental and important task in the field of speech and natural language processing. It is an inherent building block in many applications such as voice assistants, speech translation, etc. Despite…

Computation and Language · Computer Science 2024-12-05 Victor Junqiu Wei, Weicheng Wang, Di Jiang, Yuanfeng Song, Lu Wang

Revisiting ASR Error Correction with Specialized Models

Language models play a central role in automatic speech recognition (ASR), yet most methods rely on text-only models unaware of ASR error patterns. Recently, large language models (LLMs) have been applied to ASR correction, but introduce…

Machine Learning · Computer Science 2026-03-18 Zijin Gu, Tatiana Likhomanenko, He Bai, Erik McDermott, Ronan Collobert, Navdeep Jaitly

Multi-stage Large Language Model Correction for Speech Recognition

In this paper, we investigate the usage of large language models (LLMs) to improve the performance of competitive speech recognition systems. Different from previous LLM-based ASR error correction methods, we propose a novel multi-stage…

Computation and Language · Computer Science 2024-06-18 Jie Pu, Thai-Son Nguyen, Sebastian Stüker

Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR), which leverages the rich linguistic knowledge and powerful reasoning ability of LLMs to improve…

Computation and Language · Computer Science 2024-01-22 Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Chao Zhang, Pin-Yu Chen, EnSiong Chng

An approach to measuring the performance of Automatic Speech Recognition (ASR) models in the context of Large Language Model (LLM) powered applications

Automatic Speech Recognition (ASR) plays a crucial role in human-machine interaction and serves as an interface for a wide range of applications. Traditionally, ASR performance has been evaluated using Word Error Rate (WER), a metric that…

Audio and Speech Processing · Electrical Eng. & Systems 2025-07-23 Sujith Pulikodan, Sahapthan K, Prasanta Kumar Ghosh, Visruth Sanka, Nihar Desai

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Given recent advances in generative AI technology, a key question is how large language models (LLMs) can enhance acoustic modeling tasks using text decoding results from a frozen, pretrained automatic speech recognition (ASR) model. To…

Computation and Language · Computer Science 2025-12-02 Chao-Han Huck Yang, Taejin Park, Yuan Gong, Yuanchao Li, Zhehuai Chen, Yen-Ting Lin, Chen Chen, Yuchen Hu, Kunal Dhawan, Piotr Żelasko, Chao Zhang, Yun-Nung Chen, Yu Tsao, Jagadeesh Balam, Boris Ginsburg, Sabato Marco Siniscalchi, Eng Siong Chng, Peter Bell, Catherine Lai, Shinji Watanabe, Andreas Stolcke

Generative error correction for code-switching speech recognition using large language models

Code-switching (CS) speech refers to the phenomenon of mixing two or more languages within the same sentence. Despite the recent advances in automatic speech recognition (ASR), CS-ASR is still a challenging task owing to the grammatical…

Computation and Language · Computer Science 2023-10-23 Chen Chen, Yuchen Hu, Chao-Han Huck Yang, Hexin Liu, Sabato Marco Siniscalchi, Eng Siong Chng

Customizing Speech Recognition Model with Large Language Model Feedback

Automatic speech recognition (ASR) systems have achieved strong performance on general transcription tasks. However, they continue to struggle with recognizing rare named entities and adapting to domain mismatches. In contrast, large…

Computation and Language · Computer Science 2025-08-21 Shaoshi Ling, Guoli Ye

Chain of Correction for Full-text Speech Recognition with Large Language Models

Full-text error correction with Large Language Models (LLMs) for Automatic Speech Recognition (ASR) is attracting increased attention for its ability to address a wide range of error types, such as punctuation restoration and inverse text…

Computation and Language · Computer Science 2026-03-03 Zhiyuan Tang, Dong Wang, Zhikai Zhou, Yong Liu, Shen Huang, Shidong Shang

The Sound of Healthcare: Improving Medical Transcription ASR Accuracy with Large Language Models

In the rapidly evolving landscape of medical documentation, transcribing clinical dialogues accurately is increasingly paramount. This study explores the potential of Large Language Models (LLMs) to enhance the accuracy of Automatic Speech…

Computation and Language · Computer Science 2024-02-13 Ayo Adedeji, Sarita Joshi, Brendan Doohan

Correction Focused Language Model Training for Speech Recognition

Language models (LMs) have been commonly adopted to boost the performance of automatic speech recognition (ASR), particularly in domain adaptation tasks. The conventional way of LM training treats all the words in corpora equally, resulting in…

Computation and Language · Computer Science 2023-10-18 Yingyi Ma, Zhe Liu, Ozlem Kalinli

Speech LLMs are Contextual Reasoning Transcribers

Despite extensions to speech inputs, effectively leveraging the rich knowledge and contextual understanding of large language models (LLMs) in automatic speech recognition (ASR) remains non-trivial, as the task primarily involves direct…

Computation and Language · Computer Science 2026-04-02 Keqi Deng, Ruchao Fan, Bo Ren, Yiming Wang, Jinyu Li

A Comprehensive Solution to Connect Speech Encoder and Large Language Model for ASR

Recent works have shown promising results in connecting speech encoders to large language models (LLMs) for speech recognition. However, several limitations persist, including limited fine-tuning options, a lack of mechanisms to enforce…

Machine Learning · Computer Science 2024-06-26 Van Tung Pham, Yist Lin, Tao Han, Wei Li, Jun Zhang, Lu Lu, Yuxuan Wang

HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models

Advancements in deep neural networks have allowed automatic speech recognition (ASR) systems to attain human parity on several publicly available clean speech datasets. However, even state-of-the-art ASR systems experience performance…

Computation and Language · Computer Science 2023-10-17 Chen Chen, Yuchen Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Pin-Yu Chen, Eng Siong Chng

Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition

Most end-to-end (E2E) speech recognition models are composed of encoder and decoder blocks that perform acoustic and language modeling functions. Pretrained large language models (LLMs) have the potential to improve the performance of E2E…

Audio and Speech Processing · Electrical Eng. & Systems 2023-08-04 Shaoshi Ling, Yuxuan Hu, Shuangbei Qian, Guoli Ye, Yao Qian, Yifan Gong, Ed Lin, Michael Zeng