English
Related papers

Related papers: Exploring Efficient-tuning Methods in Self-supervi…

200 papers

Self-supervised learning (SSL) is a powerful technique for learning representations from unlabeled data. Transformer based models such as HuBERT, which consist a feature extractor and transformer layers, are leading the field in the speech…

Audio and Speech Processing · Electrical Eng. & Systems 2023-01-23 Zih-Ching Chen , Yu-Shun Sung , Hung-yi Lee

The utilization of speech Self-Supervised Learning (SSL) models achieves impressive performance on Automatic Speech Recognition (ASR). However, in low-resource language ASR, they encounter the domain mismatch problem between pre-trained and…

With excellent generalization ability, self-supervised speech models have shown impressive performance on various downstream speech tasks in the pre-training and fine-tuning paradigm. However, as the growing size of pre-trained models,…

Audio and Speech Processing · Electrical Eng. & Systems 2024-03-04 Mufan Sang , John H. L. Hansen

We present a method for transferring pre-trained self-supervised (SSL) speech representations to multiple languages. There is an abundance of unannotated speech, so creating self-supervised representations from raw audio and fine-tuning on…

Audio and Speech Processing · Electrical Eng. & Systems 2022-02-08 Samuel Kessler , Bethan Thomas , Salah Karout

Self-supervised learning (SSL) is a powerful tool that allows learning of underlying representations from unlabeled data. Transformer based models such as wav2vec 2.0 and HuBERT are leading the field in the speech domain. Generally these…

Computation and Language · Computer Science 2022-02-08 Bethan Thomas , Samuel Kessler , Salah Karout

Fine-tuning of self-supervised models is a powerful transfer learning method in a variety of fields, including speech processing, since it can utilize generic feature representations obtained from large amounts of unlabeled data.…

Multimedia · Computer Science 2022-12-07 Shinta Otake , Rei Kawakami , Nakamasa Inoue

Recent years have witnessed a boom in self-supervised learning (SSL) in various areas including speech processing. Speech based SSL models present promising performance in a range of speech related tasks. However, the training of SSL models…

Audio and Speech Processing · Electrical Eng. & Systems 2023-02-21 Xie Chen , Ziyang Ma , Changli Tang , Yujin Wang , Zhisheng Zheng

Adapter modules were recently introduced as an efficient alternative to fine-tuning in NLP. Adapter tuning consists in freezing pretrained parameters of a model and injecting lightweight modules between layers, resulting in the addition of…

Computation and Language · Computer Science 2021-07-14 Hang Le , Juan Pino , Changhan Wang , Jiatao Gu , Didier Schwab , Laurent Besacier

Self-supervised learning has emerged as a key approach for learning generic representations from speech data. Despite promising results in downstream tasks such as speech recognition, speaker verification, and emotion recognition, a…

Computation and Language · Computer Science 2024-08-01 Nakamasa Inoue , Shinta Otake , Takumi Hirose , Masanari Ohi , Rei Kawakami

Self-supervised learning (SSL) has allowed substantial progress in Automatic Speech Recognition (ASR) performance in low-resource settings. In this context, it has been demonstrated that larger self-supervised feature extractors are crucial…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-14 Salah Zaiem , Robin Algayres , Titouan Parcollet , Slim Essid , Mirco Ravanelli

Self-supervised learning (SSL) has proven vital in speech and audio-related applications. The paradigm trains a general model on unlabeled data that can later be used to solve specific downstream tasks. This type of model is costly to train…

Fine-tuning is a popular method for adapting text-to-speech (TTS) models to new speakers. However this approach has some challenges. Usually fine-tuning requires several hours of high quality speech per speaker. There is also that…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-02 Cheng-Ping Hsieh , Subhankar Ghosh , Boris Ginsburg

Self-Supervised Learning (SSL) has gained traction for its ability to learn rich representations with low labeling costs, applicable across diverse downstream tasks. However, assessing the downstream-task performance remains challenging due…

Sound · Computer Science 2025-10-07 Takashi Maekaku , Keita Goto , Jinchuan Tian , Yusuke Shinohara , Shinji Watanabe

Speech representations learned from Self-supervised learning (SSL) models can benefit various speech processing tasks. However, utilizing SSL representations usually requires fine-tuning the pre-trained models or designing task-specific…

Audio and Speech Processing · Electrical Eng. & Systems 2022-07-12 Kai-Wei Chang , Wei-Cheng Tseng , Shang-Wen Li , Hung-yi Lee

Self-supervised learning models have revolutionized the field of speech processing. However, the process of fine-tuning these models on downstream tasks requires substantial computational resources, particularly when dealing with multiple…

Computation and Language · Computer Science 2024-06-24 Varsha Suresh , Salah Aït-Mokhtar , Caroline Brun , Ioan Calapodescu

Self-supervised learning (SSL) methods which learn representations of data without explicit supervision have gained popularity in speech-processing tasks, particularly for single-talker applications. However, these models often have…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-02 Zili Huang , Desh Raj , Paola García , Sanjeev Khudanpur

With excellent generalization ability, SSL speech models have shown impressive performance on various downstream tasks in the pre-training and fine-tuning paradigm. However, as the size of pre-trained models grows, fine-tuning becomes…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-29 Mufan Sang , John H. L. Hansen

Recently, the pre-trained Transformer models have received a rising interest in the field of speech processing thanks to their great success in various downstream tasks. However, most fine-tuning approaches update all the parameters of the…

Audio and Speech Processing · Electrical Eng. & Systems 2022-10-31 Junyi Peng , Themos Stafylakis , Rongzhi Gu , Oldřich Plchot , Ladislav Mošner , Lukáš Burget , Jan Černocký

Adapter-based tuning has recently arisen as an alternative to fine-tuning. It works by adding light-weight adapter modules to a pretrained language model (PrLM) and only updating the parameters of adapter modules when learning on a…

Computation and Language · Computer Science 2021-06-08 Ruidan He , Linlin Liu , Hai Ye , Qingyu Tan , Bosheng Ding , Liying Cheng , Jia-Wei Low , Lidong Bing , Luo Si

Self-supervised learning (SSL) based models have been shown to generate powerful representations that can be used to improve the performance of downstream speech tasks. Several state-of-the-art SSL models are available, and each of these…

Computation and Language · Computer Science 2023-02-21 A Arunkumar , Vrunda N Sukhadia , S. Umesh
‹ Prev 1 2 3 10 Next ›