Related papers: Combined Acoustic and Pronunciation Modelling for …

Amélioration des Performances des Systèmes Automatiques de Reconnaissance de la Parole pour la Parole Non Native

In this article, we present an approach for non-native automatic speech recognition (ASR). We propose two methods to adapt existing ASR systems to non-native accents. The first method is based on the modification of acoustic models…

Computation and Language · Computer Science 2007-11-08 Ghazi Bouselmi, Dominique Fohr, Irina Illina, Jean-Paul Haton

Data-Driven Mispronunciation Pattern Discovery for Robust Speech Recognition

Recent advancements in machine learning have significantly improved speech recognition, but recognizing speech from non-fluent or accented speakers remains a challenge. Previous efforts, relying on rule-based pronunciation patterns, have…

Computation and Language · Computer Science 2025-06-04 Anna Seo Gyeong Choi, Jonghyeon Park, Myungwoo Oh

Foreign English Accent Adjustment by Learning Phonetic Patterns

State-of-the-art automatic speech recognition (ASR) systems struggle with the lack of data for rare accents. For sufficiently large datasets, neural engines tend to outshine statistical models in most natural language processing problems.…

Sound · Computer Science 2018-07-11 Fedor Kitashov, Elizaveta Svitanko, Debojyoti Dutta

A Highly Adaptive Acoustic Model for Accurate Multi-Dialect Speech Recognition

Despite the success of deep learning in speech recognition, multi-dialect speech recognition remains a difficult problem. Although dialect-specific acoustic models are known to perform well in general, they are not easy to maintain when…

Machine Learning · Computer Science 2022-05-09 Sanghyun Yoo, Inchul Song, Yoshua Bengio

Adaptive Frequency Cepstral Coefficients for Word Mispronunciation Detection

Systems based on automatic speech recognition (ASR) technology can provide important functionality in computer assisted language learning applications. This is a young but growing area of research motivated by the large number of students…

Sound · Computer Science 2016-02-29 Zhenhao Ge, Sudhendu R. Sharma, Mark J. T. Smith

Multilingual Approach to Joint Speech and Accent Recognition with DNN-HMM Framework

Humans can recognize speech, as well as the peculiar accent of the speech, simultaneously. However, present state-of-the-art ASR systems can rarely do that. In this paper, we propose a multilingual approach to recognizing English speech, and…

Audio and Speech Processing · Electrical Eng. & Systems 2021-05-11 Yizhou Peng, Jicheng Zhang, Haobo Zhang, Haihua Xu, Hao Huang, Eng Siong Chng

Improved Accent Classification Combining Phonetic Vowels with Acoustic Features

Research has shown that accent classification can be improved by integrating semantic information into a purely acoustic approach. In this work, we combine phonetic knowledge, such as vowels, with enhanced acoustic features to build an improved…

Sound · Computer Science 2016-02-25 Zhenhao Ge

Improving Language Identification of Accented Speech

Language identification from speech is a common preprocessing step in many spoken language processing systems. In recent years, this field has seen fast progress, mostly due to the use of self-supervised models pretrained on multilingual…

Audio and Speech Processing · Electrical Eng. & Systems 2022-07-04 Kunnar Kukk, Tanel Alumäe

Early Stage LM Integration Using Local and Global Log-Linear Combination

Sequence-to-sequence models with an implicit alignment mechanism (e.g. attention) are closing the performance gap towards traditional hybrid hidden Markov models (HMM) for the task of automatic speech recognition. One important factor to…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-21 Wilfried Michel, Ralf Schlüter, Hermann Ney

Accent Conversion with Articulatory Representations

Conversion of non-native accented speech to native (American) English has a wide range of applications such as improving intelligibility of non-native speech. Previous work on this domain has used phonetic posteriorgrams as the target speech…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-11 Yashish M. Siriwardena, Nathan Swedlow, Audrey Howard, Evan Gitterman, Dan Darcy, Carol Espy-Wilson, Andrea Fanelli

Automated detection of pronunciation errors in non-native English speech employing deep learning

Despite significant advances in recent years, the existing Computer-Assisted Pronunciation Training (CAPT) methods detect pronunciation errors with a relatively low accuracy (precision of 60% at 40%-80% recall). This Ph.D. work proposes…

Audio and Speech Processing · Electrical Eng. & Systems 2022-09-15 Daniel Korzekwa

Private Language Model Adaptation for Speech Recognition

Speech model adaptation is crucial to handle the discrepancy between server-side proxy training data and actual data received on local devices of users. With the use of federated learning (FL), we introduce an efficient approach on…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-16 Zhe Liu, Ke Li, Shreyan Bakshi, Fuchun Peng

Improving Speech Recognition Error Prediction for Modern and Off-the-shelf Speech Recognizers

Modeling the errors of a speech recognizer can help simulate errorful recognized speech data from plain text, which has proven useful for tasks like discriminative language modeling, improving robustness of NLP systems, where limited or…

Artificial Intelligence · Computer Science 2024-08-22 Prashant Serai, Peidong Wang, Eric Fosler-Lussier

Language Modelling for Speaker Diarization in Telephonic Interviews

The aim of this paper is to investigate the benefit of combining both language and acoustic modelling for speaker diarization. Although conventional systems only use acoustic features, in some scenarios linguistic data contain high…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-31 Miquel India, Javier Hernando, José A. R. Fonollosa

Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study

This paper explores the integration of Large Language Models (LLMs) into Automatic Speech Recognition (ASR) systems to improve transcription accuracy. The increasing sophistication of LLMs, with their in-context learning capabilities and…

Computation and Language · Computer Science 2025-06-03 Zeping Min, Jinbo Wang

Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints

We propose a first step toward multilingual end-to-end automatic speech recognition (ASR) by integrating knowledge about speech articulators. The key idea is to leverage a rich set of fundamental units that can be defined "universally"…

Audio and Speech Processing · Electrical Eng. & Systems 2023-09-19 Hao Yen, Sabato Marco Siniscalchi, Chin-Hui Lee

L1-aware Multilingual Mispronunciation Detection Framework

The phonological discrepancies between a speaker's native (L1) and the non-native language (L2) serve as a major factor for mispronunciation. This paper introduces a novel multilingual MDD architecture, L1-MultiMDD, enriched with L1-aware…

Computation and Language · Computer Science 2023-09-22 Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali

Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages

Automatic speech recognition systems have undoubtedly advanced with the integration of multilingual and multitask models such as Whisper, which have shown a promising ability to understand and process speech across a wide range of…

Computation and Language · Computer Science 2025-04-14 Xabier de Zuazo, Eva Navas, Ibon Saratxaga, Inma Hernáez Rioja

A multi-view approach for Mandarin non-native mispronunciation verification

Traditionally, the performance of non-native mispronunciation verification systems relied on effective phone-level labelling of non-native corpora. In this study, a multi-view approach is proposed to incorporate discriminative feature…

Audio and Speech Processing · Electrical Eng. & Systems 2020-09-10 Zhenyu Wang, John H. L. Hansen, Yanlu Xie

Incorporating L2 Phonemes Using Articulatory Features for Robust Speech Recognition

The limited availability of non-native speech datasets presents a major challenge in automatic speech recognition (ASR) to narrow the performance gap between native and non-native speakers. To address this, the focus of this study is on the…

Computation and Language · Computer Science 2023-06-06 Jisung Wang, Haram Lee, Myungwoo Oh