Related papers: Improving Self-supervised Pre-training using Accen…

Accented Speech Recognition With Accent-specific Codebooks

Speech accents pose a significant challenge to state-of-the-art automatic speech recognition (ASR) systems. Degradation in performance across underrepresented accents is a severe deterrent to the inclusive adoption of ASR. In this work, we…

Computation and Language · Computer Science 2023-10-30 Darshan Prabhu, Preethi Jyothi, Sriram Ganapathy, Vinit Unni

Improving Accent Identification and Accented Speech Recognition Under a Framework of Self-supervised Learning

Recently, self-supervised pre-training has gained success in automatic speech recognition (ASR). However, considering the difference between speech accents in real scenarios, how to identify accents and use accent features to improve ASR is…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-16 Keqi Deng, Songjun Cao, Long Ma

Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training

Recently, masked prediction pre-training has seen remarkable progress in self-supervised learning (SSL) for speech recognition. It usually requires a codebook obtained in an unsupervised way, making it less accurate and difficult to…

Computation and Language · Computer Science 2022-06-22 Chengyi Wang, Yiming Wang, Yu Wu, Sanyuan Chen, Jinyu Li, Shujie Liu, Furu Wei

Improving Accented Speech Recognition with Multi-Domain Training

Thanks to the rise of self-supervised learning, automatic speech recognition (ASR) systems now achieve near-human performance on a wide variety of datasets. However, they still lack generalization capability and are not robust to domain…

Machine Learning · Computer Science 2023-03-15 Lucas Maison, Yannick Estève

Best of Both Worlds: Robust Accented Speech Recognition with Adversarial Transfer Learning

Training deep neural networks for automatic speech recognition (ASR) requires large amounts of transcribed speech. This becomes a bottleneck for training robust models for accented speech which typically contains high variability in…

Audio and Speech Processing · Electrical Eng. & Systems 2021-03-11 Nilaksh Das, Sravan Bodapati, Monica Sunkara, Sundararajan Srinivasan, Duen Horng Chau

Advancing African-Accented Speech Recognition: Epistemic Uncertainty-Driven Data Selection for Generalizable ASR Models

Accents play a pivotal role in shaping human communication, enhancing our ability to convey and comprehend messages with clarity and cultural nuance. While there has been significant progress in Automatic Speech Recognition (ASR),…

Computation and Language · Computer Science 2025-06-24 Bonaventure F. P. Dossou

Don't Stop Self-Supervision: Accent Adaptation of Speech Representations via Residual Adapters

Speech representations learned in a self-supervised fashion from massive unlabeled speech corpora have been adapted successfully toward several downstream tasks. However, such representations may be skewed toward canonical data…

Computation and Language · Computer Science 2023-07-04 Anshu Bhatia, Sanchit Sinha, Saket Dingliwal, Karthik Gopalakrishnan, Sravan Bodapati, Katrin Kirchhoff

Accent Recognition with Hybrid Phonetic Features

The performance of voice-controlled systems is usually influenced by accented speech. To make these systems more robust, the frontend accent recognition (AR) technologies have received increased attention in recent years. As accent is a…

Audio and Speech Processing · Electrical Eng. & Systems 2021-05-06 Zhan Zhang, Xi Chen, Yuehai Wang, Jianyi Yang

How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario

The utilization of speech Self-Supervised Learning (SSL) models achieves impressive performance on Automatic Speech Recognition (ASR). However, in low-resource language ASR, they encounter the domain mismatch problem between pre-trained and…

Sound · Computer Science 2025-01-07 Shih-Heng Wang, Zih-Ching Chen, Jiatong Shi, Ming-To Chuang, Guan-Ting Lin, Kuan-Po Huang, David Harwath, Shang-Wen Li, Hung-yi Lee

Unsupervised Accent Adaptation Through Masked Language Model Correction Of Discrete Self-Supervised Speech Units

Self-supervised pre-trained speech models have strongly improved speech recognition, yet they are still sensitive to domain shifts and accented or atypical speech. Many of these models rely on quantisation or clustering to learn discrete…

Audio and Speech Processing · Electrical Eng. & Systems 2025-02-06 Jakob Poncelet, Hugo Van hamme

Code-switching Speech Recognition Under the Lens: Model- and Data-Centric Perspectives

Code-switching automatic speech recognition (CS-ASR) presents unique challenges due to language confusion introduced by spontaneous intra-sentence switching and accent bias that blurs the phonetic boundaries. Although the constituent…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-18 Hexin Liu, Haoyang Zhang, Qiquan Zhang, Xiangyu Zhang, Dongyuan Shi, Eng Siong Chng, Haizhou Li

Adapting Automatic Speech Recognition for Accented Air Traffic Control Communications

Effective communication in Air Traffic Control (ATC) is critical to maintaining aviation safety, yet the challenges posed by accented English remain largely unaddressed in Automatic Speech Recognition (ASR) systems. Existing models struggle…

Machine Learning · Computer Science 2025-02-28 Marcus Yu Zhe Wee, Justin Juin Hng Wong, Lynus Lim, Joe Yu Wei Tan, Prannaya Gupta, Dillion Lim, En Hao Tew, Aloysius Keng Siew Han, Yong Zhi Lim

Layer-wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition

Accent variability has posed a huge challenge to automatic speech recognition (ASR) modeling. Although one-hot accent vector based adaptation systems are commonly used, they require prior knowledge about the target accent and cannot handle…

Sound · Computer Science 2022-04-22 Xun Gong, Yizhou Lu, Zhikai Zhou, Yanmin Qian

Textual Data Augmentation for Arabic-English Code-Switching Speech Recognition

The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that speech recognition (ASR) systems handle mixed language. Designing a CS-ASR system has many challenges, mainly due to data scarcity, grammatical…

Computation and Language · Computer Science 2023-01-12 Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, Najim Dehak, Ahmed Ali, Sanjeev Khudanpur

Pushing the performances of ASR models on English and Spanish accents

Speech to text models tend to be trained and evaluated against a single target accent. This is especially true for English for which native speakers from the United States became the main benchmark. In this work, we are going to show how…

Computation and Language · Computer Science 2022-12-26 Pooja Chitkara, Morgane Riviere, Jade Copet, Frank Zhang, Yatharth Saraf

Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking

Pre-trained transformer-based models have significantly advanced automatic speech recognition (ASR), yet they remain sensitive to accent and dialectal variations, resulting in elevated word error rates (WER) in linguistically diverse…

Computation and Language · Computer Science 2025-10-13 Mohammad Hossein Sameti, Sepehr Harfi Moridani, Ali Zarean, Hossein Sameti

Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition

In this paper, we present our overall efforts to improve the performance of a code-switching speech recognition system using semi-supervised training methods from lexicon learning to acoustic modeling, on the South East Asian…

Computation and Language · Computer Science 2018-06-19 Pengcheng Guo, Haihua Xu, Lei Xie, Eng Siong Chng

Learning Fast Adaptation on Cross-Accented Speech Recognition

Local dialects influence people to pronounce words of the same language differently from each other. The great variability and complex characteristics of accents creates a major challenge for training a robust and accent-agnostic automatic…

Audio and Speech Processing · Electrical Eng. & Systems 2020-03-05 Genta Indra Winata, Samuel Cahyawijaya, Zihan Liu, Zhaojiang Lin, Andrea Madotto, Peng Xu, Pascale Fung

English Accent Accuracy Analysis in a State-of-the-Art Automatic Speech Recognition System

Nowadays, research in speech technologies has gotten a lot out thanks to recently created public domain corpora that contain thousands of recording hours. These large amounts of data are very helpful for training the new complex models…

Audio and Speech Processing · Electrical Eng. & Systems 2021-05-12 Guillermo Cámbara, Alex Peiró-Lilja, Mireia Farrús, Jordi Luque

Synthetic Cross-accent Data Augmentation for Automatic Speech Recognition

The awareness for biased ASR datasets or models has increased notably in recent years. Even for English, despite a vast amount of available training data, systems perform worse for non-native speakers. In this work, we improve an…

Computation and Language · Computer Science 2023-03-03 Philipp Klumpp, Pooja Chitkara, Leda Sarı, Prashant Serai, Jilong Wu, Irina-Elena Veliche, Rongqing Huang, Qing He