Related papers: Improving Self-supervised Pre-training using Accen…

200 papers

Speech accents pose a significant challenge to state-of-the-art automatic speech recognition (ASR) systems. Degradation in performance across underrepresented accents is a severe deterrent to the inclusive adoption of ASR. In this work, we…

Computation and Language · Computer Science 2023-10-30 Darshan Prabhu, Preethi Jyothi, Sriram Ganapathy, Vinit Unni

Recently, self-supervised pre-training has gained success in automatic speech recognition (ASR). However, considering the difference between speech accents in real scenarios, how to identify accents and use accent features to improve ASR is…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-16 Keqi Deng, Songjun Cao, Long Ma

Recently, masked prediction pre-training has seen remarkable progress in self-supervised learning (SSL) for speech recognition. It usually requires a codebook obtained in an unsupervised way, making it less accurate and difficult to…

Computation and Language · Computer Science 2022-06-22 Chengyi Wang, Yiming Wang, Yu Wu, Sanyuan Chen, Jinyu Li, Shujie Liu, Furu Wei

Thanks to the rise of self-supervised learning, automatic speech recognition (ASR) systems now achieve near-human performance on a wide variety of datasets. However, they still lack generalization capability and are not robust to domain…

Machine Learning · Computer Science 2023-03-15 Lucas Maison, Yannick Estève

Training deep neural networks for automatic speech recognition (ASR) requires large amounts of transcribed speech. This becomes a bottleneck for training robust models for accented speech which typically contains high variability in…

Audio and Speech Processing · Electrical Eng. & Systems 2021-03-11 Nilaksh Das, Sravan Bodapati, Monica Sunkara, Sundararajan Srinivasan, Duen Horng Chau

Accents play a pivotal role in shaping human communication, enhancing our ability to convey and comprehend messages with clarity and cultural nuance. While there has been significant progress in Automatic Speech Recognition (ASR),…

Computation and Language · Computer Science 2025-06-24 Bonaventure F. P. Dossou

Speech representations learned in a self-supervised fashion from massive unlabeled speech corpora have been adapted successfully toward several downstream tasks. However, such representations may be skewed toward canonical data…

Computation and Language · Computer Science 2023-07-04 Anshu Bhatia, Sanchit Sinha, Saket Dingliwal, Karthik Gopalakrishnan, Sravan Bodapati, Katrin Kirchhoff

The performance of voice-controlled systems is usually influenced by accented speech. To make these systems more robust, the frontend accent recognition (AR) technologies have received increased attention in recent years. As accent is a…

Audio and Speech Processing · Electrical Eng. & Systems 2021-05-06 Zhan Zhang, Xi Chen, Yuehai Wang, Jianyi Yang

The utilization of speech Self-Supervised Learning (SSL) models achieves impressive performance on Automatic Speech Recognition (ASR). However, in low-resource language ASR, they encounter the domain mismatch problem between pre-trained and…

Self-supervised pre-trained speech models have strongly improved speech recognition, yet they are still sensitive to domain shifts and accented or atypical speech. Many of these models rely on quantisation or clustering to learn discrete…

Audio and Speech Processing · Electrical Eng. & Systems 2025-02-06 Jakob Poncelet, Hugo Van hamme

Code-switching automatic speech recognition (CS-ASR) presents unique challenges due to language confusion introduced by spontaneous intra-sentence switching and accent bias that blurs the phonetic boundaries. Although the constituent…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-18 Hexin Liu, Haoyang Zhang, Qiquan Zhang, Xiangyu Zhang, Dongyuan Shi, Eng Siong Chng, Haizhou Li

Effective communication in Air Traffic Control (ATC) is critical to maintaining aviation safety, yet the challenges posed by accented English remain largely unaddressed in Automatic Speech Recognition (ASR) systems. Existing models struggle…

Accent variability has posed a huge challenge to automatic speech recognition (ASR) modeling. Although one-hot accent vector based adaptation systems are commonly used, they require prior knowledge about the target accent and cannot handle…

Sound · Computer Science 2022-04-22 Xun Gong, Yizhou Lu, Zhikai Zhou, Yanmin Qian

The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that speech recognition (ASR) systems handle mixed language. Designing a CS-ASR system has many challenges, mainly due to data scarcity, grammatical…

Computation and Language · Computer Science 2023-01-12 Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, Najim Dehak, Ahmed Ali, Sanjeev Khudanpur

Speech to text models tend to be trained and evaluated against a single target accent. This is especially true for English for which native speakers from the United States became the main benchmark. In this work, we are going to show how…

Computation and Language · Computer Science 2022-12-26 Pooja Chitkara, Morgane Riviere, Jade Copet, Frank Zhang, Yatharth Saraf

Pre-trained transformer-based models have significantly advanced automatic speech recognition (ASR), yet they remain sensitive to accent and dialectal variations, resulting in elevated word error rates (WER) in linguistically diverse…

Computation and Language · Computer Science 2025-10-13 Mohammad Hossein Sameti, Sepehr Harfi Moridani, Ali Zarean, Hossein Sameti

In this paper, we present our overall efforts to improve the performance of a code-switching speech recognition system using semi-supervised training methods from lexicon learning to acoustic modeling, on the South East Asian…

Computation and Language · Computer Science 2018-06-19 Pengcheng Guo, Haihua Xu, Lei Xie, Eng Siong Chng

Local dialects influence people to pronounce words of the same language differently from each other. The great variability and complex characteristics of accents create a major challenge for training a robust and accent-agnostic automatic…

Audio and Speech Processing · Electrical Eng. & Systems 2020-03-05 Genta Indra Winata, Samuel Cahyawijaya, Zihan Liu, Zhaojiang Lin, Andrea Madotto, Peng Xu, Pascale Fung

Research in speech technologies has benefited greatly from recently created public-domain corpora that contain thousands of hours of recordings. These large amounts of data are very helpful for training the new complex models…

Audio and Speech Processing · Electrical Eng. & Systems 2021-05-12 Guillermo Cámbara, Alex Peiró-Lilja, Mireia Farrús, Jordi Luque

Awareness of biased ASR datasets or models has increased notably in recent years. Even for English, despite a vast amount of available training data, systems perform worse for non-native speakers. In this work, we improve an…

Computation and Language · Computer Science 2023-03-03 Philipp Klumpp, Pooja Chitkara, Leda Sarı, Prashant Serai, Jilong Wu, Irina-Elena Veliche, Rongqing Huang, Qing He