Related papers: FluentSpeech: Stutter-Oriented Automatic Speech Ed…

200 papers

Text-based speech editing (TSE) techniques are designed to enable users to edit the output audio by modifying the input text transcript instead of the audio itself. Despite much progress in neural network-based TSE techniques, the current…

Sound · Computer Science 2023-09-25 Rui Liu, Jiatian Xi, Ziyue Jiang, Haizhou Li

Text-based speech editing (TSE) allows users to edit speech by modifying the corresponding text directly without altering the original recording. Current TSE techniques often focus on minimizing discrepancies between generated speech and…

Computation and Language · Computer Science 2024-12-10 Rui Liu, Jiatian Xi, Ziyue Jiang, Haizhou Li

Strong presentation skills are valuable and sought-after in workplace and classroom environments alike. Of the possible improvements to vocal presentations, disfluencies and stutters in particular remain one of the most common and prominent…

Audio and Speech Processing · Electrical Eng. & Systems 2020-09-25 Tedd Kourkounakis, Amirhossein Hajavi, Ali Etemad

Stuttering is a speech disorder which impacts the personal and professional lives of millions of people worldwide. To save themselves from stigma and discrimination, people who stutter (PWS) may adopt different strategies to conceal their…

Artificial Intelligence · Computer Science 2021-08-24 Bhavya Ghai, Klaus Mueller

Clinical diagnosis of stuttering requires an assessment by a licensed speech-language pathologist. However, this process is time-consuming and requires clinicians with training and experience in stuttering and fluency disorders.…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-18 Yi-Jen Shih, Zoi Gkalitsiou, Alexandros G. Dimakis, David Harwath

Over 70 million people worldwide experience stuttering, yet most automatic speech systems misinterpret disfluent utterances or fail to transcribe them accurately. Existing methods for stutter correction rely on handcrafted feature…

Audio and Speech Processing · Electrical Eng. & Systems 2025-11-06 Qianheng Xu

Speech disfluencies, such as filled pauses or repetitions, are disruptions in the typical flow of speech. Stuttering is a speech disorder characterized by a high rate of disfluencies, but all individuals speak with some disfluencies and the…

Audio and Speech Processing · Electrical Eng. & Systems 2023-11-03 Amrit Romana, Kazuhito Koishida, Emily Mower Provost

The automated classification of stuttered speech has significant implications for timely assessments providing assistance to speech language pathologists. Despite notable advancements in the field, the cases in which multiple disfluencies…

Sound · Computer Science 2025-02-27 Huma Ameer, Seemab Latif, Mehwish Fatima

People who stutter (PWS) face systemic exclusion in today's voice-driven society, where access to voice assistants, authentication systems, and remote work tools increasingly depends on fluent speech. Current automatic speech recognition…

Computers and Society · Computer Science 2026-01-16 Ziqi Xu, Yi Liu, Yuekang Li, Ling Shi, Kailong Wang, Yongxin Zhao

Stuttering is a varied speech disorder that harms an individual's communication ability. Persons who stutter (PWS) often use speech therapy to cope with their condition. Improving speech recognition systems for people with such non-typical…

Audio and Speech Processing · Electrical Eng. & Systems 2026-01-27 Sebastian P. Bayerl, Dominik Wagner, Elmar Nöth, Korbinian Riedhammer

Stuttering is a speech disorder where the natural flow of speech is interrupted by blocks, repetitions or prolongations of syllables, words and phrases. The majority of existing automatic speech recognition (ASR) interfaces perform poorly…

Detecting and segmenting dysfluencies is crucial for effective speech therapy and real-time feedback. However, most methods only classify dysfluencies at the utterance level. We introduce StutterCut, a semi-supervised framework that…

Sound · Computer Science 2025-08-05 Suhita Ghosh, Melanie Jouaiti, Jan-Ole Perschewski, Sebastian Stober

The generation of natural and high-quality speech from text is a challenging problem in the field of natural language processing. In addition to speech generation, speech editing is also a crucial task, which requires the seamless and…

Audio and Speech Processing · Electrical Eng. & Systems 2024-03-11 Antonios Alexos, Pierre Baldi

With the rise of video production and social media, speech editing has become crucial for creators to address issues like mispronunciations, missing words, or stuttering in audio recordings. This paper explores text-based speech editing…

Sound · Computer Science 2024-07-25 Tobias Kässmann, Yining Liu, Danni Liu

In recent years, advancements in the field of speech processing have led to cutting-edge deep learning algorithms with immense potential for real-world applications. The automated identification of stuttered speech is one of such…

Sound · Computer Science 2023-11-10 Huma Ameer, Seemab Latif, Rabia Latif, Sana Mukhtar

As text-based speech editing becomes increasingly prevalent, the demand for unrestricted free-text editing continues to grow. However, existing speech editing techniques encounter significant challenges, particularly in maintaining…

Sound · Computer Science 2024-09-23 Yang Chen, Yuhang Jia, Shiwan Zhao, Ziyue Jiang, Haoran Li, Jiarong Kang, Yong Qin

Diffusion-based Generative AI gains significant attention for its superior performance over other generative techniques like Generative Adversarial Networks and Variational Autoencoders. While it has achieved notable advancements in fields…

Sound · Computer Science 2024-12-12 Haowei Lou, Helen Paik, Pari Delir Haghighi, Wen Hu, Lina Yao

Automatic speech recognition systems have achieved remarkable performance on fluent speech but continue to degrade significantly when processing stuttered speech, a limitation that is particularly acute for low-resource languages like…

Computation and Language · Computer Science 2026-01-15 Fadhil Muhammad, Alwin Djuliansah, Adrian Aryaputra Hamzah, Kurniawati Azizah

Dysfluencies and variations in speech pronunciation can severely degrade speech recognition performance, and for many individuals with moderate-to-severe speech disorders, voice operated systems do not work. Current speech recognition…

With the fast development of zero-shot text-to-speech technologies, it is possible to generate high-quality speech signals that are indistinguishable from the real ones. Speech editing, including speech insertion and replacement, appeals to…

Audio and Speech Processing · Electrical Eng. & Systems 2026-05-19 Kuan-Yu Chen, Jeng-Lin Li, De-Yan Lu, Jian-Jiun Ding