Related papers: Selfsupervised learning for pathological speech de…

Self-Supervised Embeddings for Detecting Individual Symptoms of Depression

Depression, a prevalent mental health disorder impacting millions globally, demands reliable assessment systems. Unlike previous studies that focus solely on either detecting depression or predicting its severity, our work identifies…

Sound · Computer Science 2024-06-26 Sri Harsha Dumpala, Katerina Dikaios, Abraham Nunes, Frank Rudzicz, Rudolf Uher, Sageev Oore

Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition

We investigate the performance of self-supervised pretraining frameworks on pathological speech datasets used for automatic speech recognition (ASR). Modern end-to-end models require thousands of hours of data to train well, but only a…

Sound · Computer Science 2022-06-30 Lester Phillip Violeta, Wen-Chin Huang, Tomoki Toda

Unveiling Interpretability in Self-Supervised Speech Representations for Parkinson's Diagnosis

Recent works in pathological speech analysis have increasingly relied on powerful self-supervised speech representations, leading to promising results. However, the complex, black-box nature of these embeddings and the limited research on…

Computer Vision and Pattern Recognition · Computer Science 2025-02-11 David Gimeno-Gómez, Catarina Botelho, Anna Pompili, Alberto Abad, Carlos-D. Martínez-Hinarejos

Supervised Speech Representation Learning for Parkinson's Disease Classification

Recently proposed automatic pathological speech classification techniques use unsupervised auto-encoders to obtain a high-level abstract representation of speech. Since these representations are learned based on reconstructing the input,…

Audio and Speech Processing · Electrical Eng. & Systems 2021-07-22 Parvaneh Janbakhshi, Ina Kodrasi

Self-Supervised Speech Representation Learning: A Review

Although supervised deep learning has revolutionized speech and audio processing, it has necessitated the building of specialist models for individual tasks and application scenarios. It is likewise difficult to apply this to dialects and…

Computation and Language · Computer Science 2022-11-23 Abdelrahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob D. Havtorn, Joakim Edin, Christian Igel, Katrin Kirchhoff, Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe

Innovative Speech-Based Deep Learning Approaches for Parkinson's Disease Classification: A Systematic Review

Parkinson's disease (PD), the second most prevalent neurodegenerative disorder worldwide, frequently presents with early-stage speech impairments. Recent advancements in Artificial Intelligence (AI), particularly deep learning (DL), have…

Sound · Computer Science 2024-09-25 Lisanne van Gelderen, Cristian Tejedor-García

Semi-Supervised Diseased Detection from Speech Dialogues with Multi-Level Data Modeling

Detecting medical conditions from speech acoustics is fundamentally a weakly-supervised learning problem: a single, often noisy, session-level label must be linked to nuanced patterns within a long, complex audio recording. This task is…

Sound · Computer Science 2026-04-21 Xingyuan Li, Mengyue Wu

Developing vocal system impaired patient-aimed voice quality assessment approach using ASR representation-included multiple features

The potential of deep learning in clinical speech processing is immense, yet the hurdles of limited and imbalanced clinical data samples loom large. This article addresses these challenges by showcasing the utilization of automatic speech…

Sound · Computer Science 2024-08-23 Shaoxiang Dang, Tetsuya Matsumoto, Yoshinori Takeuchi, Takashi Tsuboi, Yasuhiro Tanaka, Daisuke Nakatsubo, Satoshi Maesawa, Ryuta Saito, Masahisa Katsuno, Hiroaki Kudo

Impact of Speech Mode in Automatic Pathological Speech Detection

Automatic pathological speech detection approaches yield promising results in identifying various pathologies. These approaches are typically designed and evaluated for phonetically-controlled speech scenarios, where speakers are prompted…

Machine Learning · Computer Science 2024-06-17 Shakeel A. Sheikh, Ina Kodrasi

Overview of Automatic Speech Analysis and Technologies for Neurodegenerative Disorders: Diagnosis and Assistive Applications

Advancements in spoken language technologies for neurodegenerative speech disorders are crucial for meeting both clinical and technological needs. This overview paper is vital for advancing the field, as it presents a comprehensive review…

Audio and Speech Processing · Electrical Eng. & Systems 2025-08-08 Shakeel A. Sheikh, Md. Sahidullah, Ina Kodrasi

Automatic Detection of Speech Sound Disorder in Child Speech Using Posterior-based Speaker Representations

This paper presents a macroscopic approach to automatic detection of speech sound disorder (SSD) in child speech. Typically, SSD is manifested by persistent articulation and phonological errors on specific phonemes in the language. The…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-30 Si-Ioi Ng, Cymie Wing-Yee Ng, Jiarui Wang, Tan Lee

Multi-class Detection of Pathological Speech with Latent Features: How does it perform on unseen data?

The detection of pathologies from speech features is usually defined as a binary classification task with one class representing a specific pathology and the other class representing healthy speech. In this work, we train neural networks,…

Audio and Speech Processing · Electrical Eng. & Systems 2023-08-02 Dominik Wagner, Ilja Baumann, Franziska Braun, Sebastian P. Bayerl, Elmar Nöth, Korbinian Riedhammer, Tobias Bocklet

Comparing Supervised Models And Learned Speech Representations For Classifying Intelligibility Of Disordered Speech On Selected Phrases

Automatic classification of disordered speech can provide an objective tool for identifying the presence and severity of speech impairment. Classification approaches can also help identify hard-to-recognize speech samples to teach ASR…

Audio and Speech Processing · Electrical Eng. & Systems 2021-07-09 Subhashini Venugopalan, Joel Shor, Manoj Plakal, Jimmy Tobin, Katrin Tomanek, Jordan R. Green, Michael P. Brenner

Self-Supervised Disentangled Representation Learning for Robust Target Speech Extraction

Speech signals are inherently complex as they encompass both global acoustic characteristics and local semantic information. However, in the task of target speech extraction, certain elements of global and local semantic information in the…

Sound · Computer Science 2024-08-27 Zhaoxi Mu, Xinyu Yang, Sining Sun, Qing Yang

Early Recognition of Parkinson's Disease Through Acoustic Analysis and Machine Learning

Parkinson's Disease (PD) is a progressive neurodegenerative disorder that significantly impacts both motor and non-motor functions, including speech. Early and accurate recognition of PD through speech analysis can greatly enhance patient…

Numerical Analysis · Mathematics 2024-07-24 Niloofar Fadavi, Nazanin Fadavi

Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring

Speech fluency/disfluency can be evaluated by analyzing a range of phonetic and prosodic features. Deep neural networks are commonly trained to map fluency-related features into the human scores. However, the effectiveness of deep…

Computation and Language · Computer Science 2023-05-22 Kaiqi Fu, Shaojun Gao, Shuju Shi, Xiaohai Tian, Wei Li, Zejun Ma

Automatic Screening for Children with Speech Disorder using Automatic Speech Recognition: Opportunities and Challenges

Speech is a fundamental aspect of human life, crucial not only for communication but also for cognitive, social, and academic development. Children with speech disorders (SD) face significant challenges that, if unaddressed, can result in…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-17 Dancheng Liu, Jason Yang, Ishan Albrecht-Buehler, Helen Qin, Sophie Li, Yuting Hu, Amir Nassereldine, Jinjun Xiong

The effect of speech pathology on automatic speaker verification -- a large-scale study

Navigating the challenges of data-driven speech processing, one of the primary hurdles is accessing reliable pathological speech data. While public datasets appear to offer solutions, they come with inherent risks of potential unintended…

Sound · Computer Science 2023-11-23 Soroosh Tayebi Arasteh, Tobias Weise, Maria Schuster, Elmar Noeth, Andreas Maier, Seung Hee Yang

Automatic Detection of Phonological Errors in Child Speech Using Siamese Recurrent Autoencoder

Speech sound disorder (SSD) refers to the developmental disorder in which children encounter persistent difficulties in correctly pronouncing words. Assessment of SSD has been relying largely on trained speech and language pathologists…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-10 Si-Ioi Ng, Tan Lee

A Large-Scale Probing Analysis of Speaker-Specific Attributes in Self-Supervised Speech Representations

Enhancing explainability in speech self-supervised learning (SSL) is important for developing reliable SSL-based speech processing systems. This study probes how speech SSL models encode speaker-specific information via a large-scale…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-06 Aemon Yat Fei Chiu, Kei Ching Fung, Roger Tsz Yeung Li, Jingyu Li, Tan Lee