Related papers: Refining Self-Supervised Learnt Speech Representat…

200 papers

Large, pre-trained representation models trained using self-supervised learning have gained popularity in various fields of machine learning because they are able to extract high-quality salient features from input data. As such, they have…

Audio and Speech Processing · Electrical Eng. & Systems 2023-06-16 Hejung Yang, Hong-Goo Kang

Recently proposed self-supervised learning approaches have been successful for pre-training speech representation models. The utility of these learned representations has been observed empirically, but not much has been studied about the…

Computation and Language · Computer Science 2022-12-06 Ankita Pasad, Ju-Chieh Chou, Karen Livescu

Pretrained self-supervised speech models excel in speech tasks but do not reflect the hierarchy of human speech processing, as they encode rich semantics in middle layers and poor semantics in late layers. Recent work showed that…

Computation and Language · Computer Science 2025-06-05 Omer Moussa, Mariya Toneva

Speech language models align with human brain responses to natural language to an impressive degree. However, current models rely heavily on low-level speech features, indicating they lack brain-relevant semantics which limits their utility…

Computation and Language · Computer Science 2025-03-05 Omer Moussa, Dietrich Klakow, Mariya Toneva

Over the last decade, numerous studies have shown that deep neural networks exhibit sensory representations similar to those of the mammalian brain, in that their activations linearly map onto cortical responses to the same sensory inputs.…

Neurons and Cognition · Quantitative Biology 2022-02-16 Pierre Orhan, Yves Boubenec, Jean-Rémi King

Several deep neural networks have recently been shown to generate activations similar to those of the brain in response to the same input. These algorithms, however, remain largely implausible: they require (1) extraordinarily large amounts…

Representation learning from unlabeled data has been of major interest in artificial intelligence research. While self-supervised speech representation learning has been popular in the speech research community, very few works have…

Progress in natural language processing (NLP) models that estimate representations of word sequences has recently been leveraged to improve the understanding of language processing in the brain. However, these models have not been…

Neurons and Cognition · Quantitative Biology 2019-11-11 Dan Schwartz, Mariya Toneva, Leila Wehbe

Self-supervised pre-trained models such as Wav2vec2, Hubert, and WavLM have been shown to significantly improve many speech tasks. However, their large memory and strong computational requirements hinder their industrial applicability.…

Audio and Speech Processing · Electrical Eng. & Systems 2025-05-08 Haoyu Wang, Siyuan Wang, Wei-Qiang Zhang, Hongbin Suo, Yulong Wan

Self-supervised speech representation learning has recently been a prosperous research topic. Many algorithms have been proposed for learning useful representations from large-scale unlabeled data, and their applications to a wide range of…

Audio and Speech Processing · Electrical Eng. & Systems 2021-02-03 Yu-An Chung, Yonatan Belinkov, James Glass

Self-supervised learning methods such as wav2vec 2.0 have shown promising results in learning speech representations from unlabelled and untranscribed speech data that are useful for speech recognition. Since these representations are…

Audio and Speech Processing · Electrical Eng. & Systems 2022-03-22 Shehzeen Hussain, Van Nguyen, Shuhua Zhang, Erik Visser

Speech and language models trained through self-supervised learning (SSL) demonstrate strong alignment with brain activity during speech and language perception. However, given their distinct training modalities, it remains unclear whether…

Neurons and Cognition · Quantitative Biology 2024-02-01 Peili Chen, Linyang He, Li Fu, Lu Fan, Edward F. Chang, Yuanning Li

Self-supervised learning models have revolutionized the field of speech processing. However, the process of fine-tuning these models on downstream tasks requires substantial computational resources, particularly when dealing with multiple…

Computation and Language · Computer Science 2024-06-24 Varsha Suresh, Salah Aït-Mokhtar, Caroline Brun, Ioan Calapodescu

Pre-trained speech Transformers have facilitated great success across various speech processing tasks. However, fine-tuning these encoders for downstream tasks requires sufficiently large training data to converge or to achieve…

Computation and Language · Computer Science 2022-10-25 Hao Yang, Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi

Self-supervised learning (SSL) foundation models have emerged as powerful, domain-agnostic, general-purpose feature extractors applicable to a wide range of tasks. Such models pre-trained on human speech have demonstrated high…

Machine Learning · Computer Science 2025-01-22 Eklavya Sarkar, Mathew Magimai.-Doss

Self-supervised pre-training can effectively improve the performance of low-resource automatic speech recognition (ASR). However, existing self-supervised pre-training methods are task-agnostic, i.e., they can be applied to various downstream tasks.…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-20 Han Zhu, Li Wang, Jindong Wang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan

We present a method for transferring pre-trained self-supervised (SSL) speech representations to multiple languages. There is an abundance of unannotated speech, so creating self-supervised representations from raw audio and fine-tuning on…

Audio and Speech Processing · Electrical Eng. & Systems 2022-02-08 Samuel Kessler, Bethan Thomas, Salah Karout

Self-supervised language models are very effective at predicting high-level cortical responses during language comprehension. However, the best current models of lower-level auditory processing in the human brain rely on either…

Computation and Language · Computer Science 2022-05-31 Aditya R. Vaidya, Shailee Jain, Alexander G. Huth

Self-supervised speech models such as wav2vec2.0 and WavLM have been shown to significantly improve the performance of many downstream speech tasks, especially in low-resource settings, over the past few years. Despite this, evaluations on…

Audio and Speech Processing · Electrical Eng. & Systems 2025-12-18 Séverin Baroudi, Hervé Bredin, Joseph Razik, Ricard Marxer

Self-supervised pretraining on speech data has achieved a lot of progress. High-fidelity representation of the speech signal is learned from a lot of untranscribed data and shows promising performance. Recently, there are several works…
