Related papers: ASR Error Correction using Large Language Models

200 papers

Error correction (EC) based on large language models is an emerging technology to enhance the performance of automatic speech recognition (ASR) systems. Generally, training data for EC are collected by automatically pairing a large set of…

Computation and Language · Computer Science 2024-10-17 Takuma Udagawa , Masayuki Suzuki , Masayasu Muraoka , Gakuto Kurata

This paper explores the integration of Large Language Models (LLMs) into Automatic Speech Recognition (ASR) systems to improve transcription accuracy. The increasing sophistication of LLMs, with their in-context learning capabilities and…

Computation and Language · Computer Science 2025-06-03 Zeping Min , Jinbo Wang

Automatic Speech Recognition (ASR) has recently shown remarkable progress, but accurately transcribing children's speech remains a significant challenge. Recent developments in Large Language Models (LLMs) have shown promise in improving…

Computation and Language · Computer Science 2025-05-27 Anfeng Xu , Tiantian Feng , So Hyun Kim , Somer Bishop , Catherine Lord , Shrikanth Narayanan

We propose to utilize an instruction-tuned large language model (LLM) for guiding the text generation process in automatic speech recognition (ASR). Modern large language models (LLMs) are adept at performing various text generation tasks…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-08 Yosuke Higuchi , Tetsuji Ogawa , Tetsunori Kobayashi

ASR error correction is an interesting option for post processing speech recognition system outputs. These error correction models are usually trained in a supervised fashion using the decoding results of a target ASR system. This approach…

Computation and Language · Computer Science 2023-10-02 Rao Ma , Mengjie Qian , Potsawee Manakul , Mark Gales , Kate Knill

Automatic Speech Recognition (ASR) is a fundamental and important task in the field of speech and natural language processing. It is an inherent building block in many applications such as voice assistants, speech translation, etc. Despite…

Computation and Language · Computer Science 2024-12-05 Victor Junqiu Wei , Weicheng Wang , Di Jiang , Yuanfeng Song , Lu Wang

Language models play a central role in automatic speech recognition (ASR), yet most methods rely on text-only models unaware of ASR error patterns. Recently, large language models (LLMs) have been applied to ASR correction, but introduce…

Machine Learning · Computer Science 2026-03-18 Zijin Gu , Tatiana Likhomanenko , He Bai , Erik McDermott , Ronan Collobert , Navdeep Jaitly

In this paper, we investigate the usage of large language models (LLMs) to improve the performance of competitive speech recognition systems. Different from previous LLM-based ASR error correction methods, we propose a novel multi-stage…

Computation and Language · Computer Science 2024-06-18 Jie Pu , Thai-Son Nguyen , Sebastian Stüker

Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR), which leverages the rich linguistic knowledge and powerful reasoning ability of LLMs to improve…

Computation and Language · Computer Science 2024-01-22 Yuchen Hu , Chen Chen , Chao-Han Huck Yang , Ruizhe Li , Chao Zhang , Pin-Yu Chen , EnSiong Chng

Automatic Speech Recognition (ASR) plays a crucial role in human-machine interaction and serves as an interface for a wide range of applications. Traditionally, ASR performance has been evaluated using Word Error Rate (WER), a metric that…

Audio and Speech Processing · Electrical Eng. & Systems 2025-07-23 Sujith Pulikodan , Sahapthan K , Prasanta Kumar Ghosh , Visruth Sanka , Nihar Desai

Given recent advances in generative AI technology, a key question is how large language models (LLMs) can enhance acoustic modeling tasks using text decoding results from a frozen, pretrained automatic speech recognition (ASR) model. To…

Code-switching (CS) speech refers to the phenomenon of mixing two or more languages within the same sentence. Despite the recent advances in automatic speech recognition (ASR), CS-ASR is still a challenging task owing to the grammatical…

Computation and Language · Computer Science 2023-10-23 Chen Chen , Yuchen Hu , Chao-Han Huck Yang , Hexin Liu , Sabato Marco Siniscalchi , Eng Siong Chng

Automatic speech recognition (ASR) systems have achieved strong performance on general transcription tasks. However, they continue to struggle with recognizing rare named entities and adapting to domain mismatches. In contrast, large…

Computation and Language · Computer Science 2025-08-21 Shaoshi Ling , Guoli Ye

Full-text error correction with Large Language Models (LLMs) for Automatic Speech Recognition (ASR) is attracting increased attention for its ability to address a wide range of error types, such as punctuation restoration and inverse text…

Computation and Language · Computer Science 2026-03-03 Zhiyuan Tang , Dong Wang , Zhikai Zhou , Yong Liu , Shen Huang , Shidong Shang

In the rapidly evolving landscape of medical documentation, transcribing clinical dialogues accurately is increasingly paramount. This study explores the potential of Large Language Models (LLMs) to enhance the accuracy of Automatic Speech…

Computation and Language · Computer Science 2024-02-13 Ayo Adedeji , Sarita Joshi , Brendan Doohan

Language models (LMs) have been commonly adopted to boost the performance of automatic speech recognition (ASR), particularly in domain adaptation tasks. The conventional way of LM training treats all the words in corpora equally, resulting in…

Computation and Language · Computer Science 2023-10-18 Yingyi Ma , Zhe Liu , Ozlem Kalinli

Despite extensions to speech inputs, effectively leveraging the rich knowledge and contextual understanding of large language models (LLMs) in automatic speech recognition (ASR) remains non-trivial, as the task primarily involves direct…

Computation and Language · Computer Science 2026-04-02 Keqi Deng , Ruchao Fan , Bo Ren , Yiming Wang , Jinyu Li

Recent works have shown promising results in connecting speech encoders to large language models (LLMs) for speech recognition. However, several limitations persist, including limited fine-tuning options, a lack of mechanisms to enforce…

Machine Learning · Computer Science 2024-06-26 Van Tung Pham , Yist Lin , Tao Han , Wei Li , Jun Zhang , Lu Lu , Yuxuan Wang

Advancements in deep neural networks have allowed automatic speech recognition (ASR) systems to attain human parity on several publicly available clean speech datasets. However, even state-of-the-art ASR systems experience performance…

Computation and Language · Computer Science 2023-10-17 Chen Chen , Yuchen Hu , Chao-Han Huck Yang , Sabato Marco Siniscalchi , Pin-Yu Chen , Eng Siong Chng

Most end-to-end (E2E) speech recognition models are composed of encoder and decoder blocks that perform acoustic and language modeling functions. Pretrained large language models (LLMs) have the potential to improve the performance of E2E…

Audio and Speech Processing · Electrical Eng. & Systems 2023-08-04 Shaoshi Ling , Yuxuan Hu , Shuangbei Qian , Guoli Ye , Yao Qian , Yifan Gong , Ed Lin , Michael Zeng