Related papers: Towards Automatic Data Augmentation for Disordered…

200 papers

Automatic recognition of disordered speech remains a highly challenging task to date. The underlying neuro-motor conditions, often compounded with co-occurring physical disabilities, lead to the difficulty in collecting large quantities of…

Audio and Speech Processing · Electrical Eng. & Systems 2021-08-03 Zengrui Jin, Mengzhe Geng, Xurong Xie, Jianwei Yu, Shansong Liu, Xunying Liu, Helen Meng

Disordered speech recognition is a highly challenging task. The underlying neuro-motor conditions of people with speech disorders, often compounded with co-occurring physical disabilities, lead to the difficulty in collecting large…

Sound · Computer Science 2022-01-20 Mengzhe Geng, Xurong Xie, Shansong Liu, Jianwei Yu, Shoukang Hu, Xunying Liu, Helen Meng

Automatic recognition of dysarthric speech remains a highly challenging task to date. Neuro-motor conditions and co-occurring physical disabilities create difficulty in large-scale data collection for ASR system development. Adapting SSL…

Sound · Computer Science 2024-01-02 Huimeng Wang, Zengrui Jin, Mengzhe Geng, Shujie Hu, Guinan Li, Tianzi Wang, Haoning Xu, Xunying Liu

Automatic speech recognition (ASR) research has achieved impressive performance in recent years and has significant potential for enabling access for people with dysarthria (PwD) in augmentative and alternative communication (AAC) and home…

Sound · Computer Science 2024-06-14 Wing-Zin Leung, Mattias Cross, Anton Ragni, Stefan Goetze

We propose an on-the-fly data augmentation method for automatic speech recognition (ASR) that uses alignment information to generate effective training samples. Our method, called Aligned Data Augmentation (ADA) for ASR, replaces…

Computation and Language · Computer Science 2023-06-13 Tsz Kin Lam, Mayumi Ohta, Shigehiko Schamoni, Stefan Riezler

Despite the rapid progress of automatic speech recognition (ASR) technologies targeting normal speech, accurate recognition of dysarthric and elderly speech remains a highly challenging task to date. It is difficult to collect large…

Audio and Speech Processing · Electrical Eng. & Systems 2025-11-05 Zengrui Jin, Mengzhe Geng, Jiajun Deng, Tianzi Wang, Shujie Hu, Guinan Li, Xunying Liu

While automatic speech recognition (ASR) greatly benefits from data augmentation, the augmentation recipes themselves tend to be heuristic. In this paper, we address one of the heuristic approaches associated with balancing the right amount…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-17 Vishwanath Pratap Singh, Federico Malato, Ville Hautamaki, Md. Sahidullah, Tomi Kinnunen

In this work, we exploit speech enhancement for improving a recurrent neural network transducer (RNN-T) based ASR system. We employ a dense convolutional recurrent network (DCRN) for complex spectral mapping based speech enhancement, and…

Sound · Computer Science 2020-11-10 Ashutosh Pandey, Chunxi Liu, Yun Wang, Yatharth Saraf

Automatic speech recognition (ASR) systems often falter while processing stuttering-related disfluencies -- such as involuntary blocks and word repetitions -- yielding inaccurate transcripts. A critical barrier to progress is the scarcity…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-03 Dena Mujtaba, Nihar R. Mahapatra, Megan Arney, J. Scott Yaruss, Caryn Herring, Jia Bin

Psychoacoustic studies have shown that locally time-reversed (LTR) speech, i.e., signal samples time-reversed within a short segment, can be accurately recognised by human listeners. This study addresses the question of how well a…

Audio and Speech Processing · Electrical Eng. & Systems 2021-10-12 Si-Ioi Ng, Tan Lee

Automatic recognition of disordered speech remains a highly challenging task to date. The underlying neuro-motor conditions, often compounded with co-occurring physical disabilities, lead to the difficulty in collecting large quantities of…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-21 Zengrui Jin, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shujie Hu, Jiajun Deng, Guinan Li, Xunying Liu

Nowadays, the main problem of deep learning techniques used in the development of automatic speech recognition (ASR) models is the lack of transcribed data. The goal of this research is to propose a new data augmentation method to improve…

Computation and Language · Computer Science 2022-04-04 Rodolfo Zevallos

Self-Supervised Learning (SSL) has allowed leveraging large amounts of unlabeled speech data to improve the performance of speech recognition models even with small annotated datasets. Despite this, speech SSL representations may fail while…

Audio and Speech Processing · Electrical Eng. & Systems 2023-06-02 Salah Zaiem, Titouan Parcollet, Slim Essid

Recent advances in text-to-speech (TTS) led to the development of flexible multi-speaker end-to-end TTS systems. We extend state-of-the-art attention-based automatic speech recognition (ASR) systems with synthetic audio generated by a TTS…

Computation and Language · Computer Science 2020-02-18 Nick Rossenbach, Albert Zeyer, Ralf Schlüter, Hermann Ney

Automatic recognition of disordered and elderly speech remains a highly challenging task to date due to the difficulty in collecting such data in large quantities. This paper explores a series of approaches to integrate domain adapted SSL…

Sound · Computer Science 2023-06-23 Shujie Hu, Xurong Xie, Zengrui Jin, Mengzhe Geng, Yi Wang, Mingyu Cui, Jiajun Deng, Xunying Liu, Helen Meng

Sequence-to-Sequence (S2S) models recently started to show state-of-the-art performance for automatic speech recognition (ASR). With these large and deep models overfitting remains the largest problem, outweighing performance improvements…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-04 Thai-Son Nguyen, Sebastian Stueker, Jan Niehues, Alex Waibel

Despite the rapid progress of automatic speech recognition (ASR) technologies in the past few decades, recognition of disordered speech remains a highly challenging task to date. Disordered speech presents a wide spectrum of challenges to…

Audio and Speech Processing · Electrical Eng. & Systems 2022-03-01 Shansong Liu, Mengzhe Geng, Shoukang Hu, Xurong Xie, Mingyu Cui, Jianwei Yu, Xunying Liu, Helen Meng

This paper investigates the use of unsupervised text-to-speech synthesis (TTS) as a data augmentation method to improve accented speech recognition. TTS systems are trained with a small amount of accented speech training data and their…

Computation and Language · Computer Science 2024-07-08 Cong-Thanh Do, Shuhei Imai, Rama Doddipatla, Thomas Hain

Speech-based virtual assistants, such as Amazon Alexa, Google Assistant, and Apple Siri, typically convert users' audio signals to text data through automatic speech recognition (ASR) and feed the text to downstream dialog models for…

Computation and Language · Computer Science 2020-06-11 Longshaokan Wang, Maryam Fazel-Zarandi, Aditya Tiwari, Spyros Matsoukas, Lazaros Polymenakos

The performance of automatic speech recognition (ASR) systems has advanced substantially in recent years, particularly for languages for which a large amount of transcribed speech is available. Unfortunately, for low-resource languages,…

Computation and Language · Computer Science 2023-05-22 Martijn Bartelds, Nay San, Bradley McDonnell, Dan Jurafsky, Martijn Wieling