Related papers: Self-Supervised Learning-Based Source Separation f…

200 papers

Neural speech separation has made remarkable progress and its integration with automatic speech recognition (ASR) is an important direction towards realizing multi-speaker ASR. This work provides an insightful investigation of speech…

Self-supervised learning (SSL) methods such as WavLM have shown promising speech separation (SS) results in small-scale simulation-based experiments. In this work, we extend the exploration of the SSL-based SS by massively scaling up both…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-29 Zhuo Chen , Naoyuki Kanda , Jian Wu , Yu Wu , Xiaofei Wang , Takuya Yoshioka , Jinyu Li , Sunit Sivasankaran , Sefik Emre Eskimez

Speech separation has been successfully applied as a frontend processing module of conversation transcription systems thanks to its ability to handle overlapped speech and its flexibility to combine with downstream tasks such as automatic…

Audio and Speech Processing · Electrical Eng. & Systems 2021-07-06 Jian Wu , Zhuo Chen , Sanyuan Chen , Yu Wu , Takuya Yoshioka , Naoyuki Kanda , Shujie Liu , Jinyu Li

Self-supervised learning (SSL) methods which learn representations of data without explicit supervision have gained popularity in speech-processing tasks, particularly for single-talker applications. However, these models often have…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-02 Zili Huang , Desh Raj , Paola García , Sanjeev Khudanpur

Self-supervised learning (SSL) models have achieved considerable improvements in automatic speech recognition (ASR). In addition, ASR performance could be further improved if the model is dedicated to audio content information learning…

Audio and Speech Processing · Electrical Eng. & Systems 2022-12-08 Genshun Wan , Tan Liu , Hang Chen , Jia Pan , Cong Liu , Zhongfu Ye

In recent years, speech-based self-supervised learning (SSL) has made significant progress in various tasks, including automatic speech recognition (ASR). An ASR model with decent performance can be realized by fine-tuning an SSL model with…

Audio and Speech Processing · Electrical Eng. & Systems 2023-08-30 Zhisheng Zheng , Ziyang Ma , Yu Wang , Xie Chen

Automatic speech recognition (ASR) has shown rapid advances in recent years but still degrades significantly in far-field and noisy environments. The recent development of self-supervised learning (SSL) technology can improve the ASR…

Sound · Computer Science 2022-05-05 Changfeng Gao , Gaofeng Cheng , Pengyuan Zhang

Transcribing meetings containing overlapped speech with only a single distant microphone (SDM) has been one of the most challenging problems for automatic speech recognition (ASR). While various approaches have been proposed, all previous…

Audio and Speech Processing · Electrical Eng. & Systems 2021-04-14 Naoyuki Kanda , Guoli Ye , Yu Wu , Yashesh Gaur , Xiaofei Wang , Zhong Meng , Zhuo Chen , Takuya Yoshioka

Self-supervised learning (SSL) has allowed substantial progress in Automatic Speech Recognition (ASR) performance in low-resource settings. In this context, it has been demonstrated that larger self-supervised feature extractors are crucial…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-14 Salah Zaiem , Robin Algayres , Titouan Parcollet , Slim Essid , Mirco Ravanelli

The utilization of speech Self-Supervised Learning (SSL) models achieves impressive performance on Automatic Speech Recognition (ASR). However, in low-resource language ASR, they encounter the domain mismatch problem between pre-trained and…

Automatic speech recognition (ASR) models rely on high-quality transcribed data for effective training. Generating pseudo-labels for large unlabeled audio datasets often relies on complex pipelines that combine multiple ASR outputs through…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-06 Jeena Prakash , Blessingh Kumar , Kadri Hacioglu , Bidisha Sharma , Sindhuja Gopalan , Malolan Chetlur , Shankar Venkatesan , Andreas Stolcke

Self-supervised learning (SSL) based speech pre-training has attracted much attention for its capability of extracting rich representations learned from massive unlabeled data. On the other hand, the use of weakly-supervised data is less…

Audio and Speech Processing · Electrical Eng. & Systems 2023-06-30 Wangyou Zhang , Yanmin Qian

In real-world applications, automatic speech recognition (ASR) systems must handle overlapping speech from multiple speakers and recognize rare words like technical terms. Traditional methods address multi-talker ASR and contextual biasing…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-17 Jiajun He , Naoki Sawada , Koichi Miyazaki , Tomoki Toda

Real-world sound scenes consist of time-varying collections of sound sources, each generating characteristic sound events that are mixed together in audio recordings. The association of these constituent sound events with their mixture and…

Self-supervised learning (SSL) methods have proven to be very successful in automatic speech recognition (ASR). These great improvements have been reported mostly based on highly curated datasets such as LibriSpeech for non-streaming…

Sound · Computer Science 2022-05-19 Mostafa Karimi , Changliang Liu , Kenichi Kumatani , Yao Qian , Tianyu Wu , Jian Wu

Self-supervised learning (SSL)-based speech models are extensively used for full-stack speech processing. However, it has been observed that improving SSL-based speech representations using unlabeled speech for content-related tasks is…

Computation and Language · Computer Science 2024-06-14 Amit Meghanani , Thomas Hain

Pre-trained models, especially self-supervised learning (SSL) models, have demonstrated impressive results in automatic speech recognition (ASR) tasks. While most applications of SSL models focus on leveraging continuous representations as…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-03 Zehan Li , Yan Yang , Xueqing Li , Jian Kang , Xiao-Lei Zhang , Jie Li

Recent advancements in Self-Supervised Learning (SSL) have shown promising results in Speaker Verification (SV). However, narrowing the performance gap with supervised systems remains an ongoing challenge. Several studies have observed that…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-25 Victor Miara , Theo Lepage , Reda Dehak

Automatic speech recognition (ASR) of overlapped speech remains a highly challenging task to date. To this end, multi-channel microphone array data are widely used in state-of-the-art ASR systems. Motivated by the invariance of visual…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-19 Jianwei Yu , Bo Wu , Rongzhi Gu , Shi-Xiong Zhang , Lianwu Chen , Yong Xu , Meng Yu , Dan Su , Dong Yu , Xunying Liu , Helen Meng

Most approaches to multi-talker overlapped speech separation and recognition assume that the number of simultaneously active speakers is given, but in realistic situations, it is typically unknown. To cope with this, we extend an iterative…

Audio and Speech Processing · Electrical Eng. & Systems 2020-12-22 Thilo von Neumann , Christoph Boeddeker , Lukas Drude , Keisuke Kinoshita , Marc Delcroix , Tomohiro Nakatani , Reinhold Haeb-Umbach