Related papers: CHAPTER: Exploiting Convolutional Neural Network A…

200 papers

In this study, we aim to explore efficient tuning methods for speech self-supervised learning. Recent studies show that self-supervised learning (SSL) can learn powerful representations for different speech tasks. However, fine-tuning…

Audio and Speech Processing · Electrical Eng. & Systems 2023-01-31 Zih-Ching Chen, Chin-Lun Fu, Chih-Ying Liu, Shang-Wen Li, Hung-yi Lee

Self-supervised learning (SSL) is a powerful tool that allows learning of underlying representations from unlabeled data. Transformer based models such as wav2vec 2.0 and HuBERT are leading the field in the speech domain. Generally these…

Computation and Language · Computer Science 2022-02-08 Bethan Thomas, Samuel Kessler, Salah Karout

In recent years, self-supervised learning (SSL) has achieved tremendous success in various speech tasks due to its power to extract representations from massive unlabeled data. However, compared with tasks such as speech recognition (ASR),…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-14 Tianrui Wang, Xie Chen, Zhuo Chen, Shu Yu, Weibin Zhu

The utilization of speech Self-Supervised Learning (SSL) models achieves impressive performance on Automatic Speech Recognition (ASR). However, in low-resource language ASR, they encounter the domain mismatch problem between pre-trained and…

We present a method for transferring pre-trained self-supervised (SSL) speech representations to multiple languages. There is an abundance of unannotated speech, so creating self-supervised representations from raw audio and fine-tuning on…

Audio and Speech Processing · Electrical Eng. & Systems 2022-02-08 Samuel Kessler, Bethan Thomas, Salah Karout

Recently Transformer and Convolution neural network (CNN) based models have shown promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural networks (RNNs). Transformer models are good at capturing…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-19 Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang

Self-supervised learning (SSL)-based speech models are extensively used for full-stack speech processing. However, it has been observed that improving SSL-based speech representations using unlabeled speech for content-related tasks is…

Computation and Language · Computer Science 2024-06-14 Amit Meghanani, Thomas Hain

Self-Supervised Learning (SSL) models have been successfully applied in various deep learning-based speech tasks, particularly those with a limited amount of data. However, the quality of SSL representations depends highly on the…

Computation and Language · Computer Science 2022-04-20 Dan Berrebbi, Jiatong Shi, Brian Yan, Osbel Lopez-Francisco, Jonathan D. Amith, Shinji Watanabe

Recent years have witnessed a boom in self-supervised learning (SSL) in various areas including speech processing. Speech based SSL models present promising performance in a range of speech related tasks. However, the training of SSL models…

Audio and Speech Processing · Electrical Eng. & Systems 2023-02-21 Xie Chen, Ziyang Ma, Changli Tang, Yujin Wang, Zhisheng Zheng

Self-supervised learning (SSL) has advanced speech processing. However, existing speech SSL methods typically assume a single sampling rate and struggle with mixed-rate data due to temporal resolution mismatch. To address this limitation,…

Sound · Computer Science 2026-03-25 Zikang Huang, Meng Ge, Tianrui Wang, Xuanchen Li, Xiaobao Wang, Longbiao Wang, Jianwu Dang

With excellent generalization ability, self-supervised speech models have shown impressive performance on various downstream speech tasks in the pre-training and fine-tuning paradigm. However, as the growing size of pre-trained models,…

Audio and Speech Processing · Electrical Eng. & Systems 2024-03-04 Mufan Sang, John H. L. Hansen

Self-supervised learning has emerged as a key approach for learning generic representations from speech data. Despite promising results in downstream tasks such as speech recognition, speaker verification, and emotion recognition, a…

Computation and Language · Computer Science 2024-08-01 Nakamasa Inoue, Shinta Otake, Takumi Hirose, Masanari Ohi, Rei Kawakami

This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features; we extract a speaker representation used for adaptation directly from the test utterance. Conventional studies of deep neural…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-17 Yuma Koizumi, Kohei Yatabe, Marc Delcroix, Yoshiki Masuyama, Daiki Takeuchi

Self-supervised learning (SSL) methods which learn representations of data without explicit supervision have gained popularity in speech-processing tasks, particularly for single-talker applications. However, these models often have…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-02 Zili Huang, Desh Raj, Paola García, Sanjeev Khudanpur

Pre-trained self-supervised learning (SSL) models have achieved remarkable success in various speech tasks. However, their potential in target speech extraction (TSE) has not been fully exploited. TSE aims to extract the speech of a target…

Audio and Speech Processing · Electrical Eng. & Systems 2024-02-21 Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Shoko Araki, Jan Cernocky

Recently, self-supervised learning (SSL) from unlabelled speech data has gained increased attention in the automatic speech recognition (ASR) community. Typical SSL methods include autoregressive predictive coding (APC), Wav2vec2.0, and…

Audio and Speech Processing · Electrical Eng. & Systems 2023-05-02 Ruchao Fan, Yunzheng Zhu, Jinhan Wang, Abeer Alwan

Self-Supervised Learning (SSL) has allowed leveraging large amounts of unlabeled speech data to improve the performance of speech recognition models even with small annotated datasets. Despite this, speech SSL representations may fail while…

Audio and Speech Processing · Electrical Eng. & Systems 2023-06-02 Salah Zaiem, Titouan Parcollet, Slim Essid

Self-supervised learning (SSL) is a long-standing goal for speech processing, since it utilizes large-scale unlabeled data and avoids extensive human labeling. Recent years witness great successes in applying self-supervised learning in…

Computation and Language · Computer Science 2021-10-13 Sanyuan Chen, Yu Wu, Chengyi Wang, Zhengyang Chen, Zhuo Chen, Shujie Liu, Jian Wu, Yao Qian, Furu Wei, Jinyu Li, Xiangzhan Yu

Recent years have witnessed great strides in self-supervised learning (SSL) on the speech processing. The SSL model is normally pre-trained on a great variety of unlabelled data and a large model size is preferred to increase the modeling…

Audio and Speech Processing · Electrical Eng. & Systems 2025-05-08 Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang

Singing voice beat tracking is a challenging task, due to the lack of musical accompaniment that often contains robust rhythmic and harmonic patterns, something most existing beat tracking systems utilize and can be essential for estimating…

Sound · Computer Science 2025-03-14 Jiajun Deng, Yaolong Ju, Jing Yang, Simon Lui, Xunying Liu