Related papers: Efficiency-oriented approaches for self-supervised…

Sustainable self-supervised learning for speech representations

Sustainable artificial intelligence focuses on data, hardware, and algorithms to make machine learning models more environmentally responsible. In particular, machine learning models for speech representations are computationally expensive,…

Computation and Language · Computer Science 2024-06-13 Luis Lugo, Valentin Vielzeuf

Self-Supervised Speech Representation Learning: A Review

Although supervised deep learning has revolutionized speech and audio processing, it has necessitated the building of specialist models for individual tasks and application scenarios. It is likewise difficult to apply this to dialects and…

Computation and Language · Computer Science 2022-11-23 Abdelrahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob D. Havtorn, Joakim Edin, Christian Igel, Katrin Kirchhoff, Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe

A Brief Overview of Unsupervised Neural Speech Representation Learning

Unsupervised representation learning for speech processing has matured greatly in the last few years. Work in computer vision and natural language processing has paved the way, but speech data offers unique challenges. As a result, methods…

Audio and Speech Processing · Electrical Eng. & Systems 2022-03-04 Lasse Borgholt, Jakob Drachmann Havtorn, Joakim Edin, Lars Maaløe, Christian Igel

Self-Supervised Representation Learning: Introduction, Advances and Challenges

Self-supervised representation learning methods aim to provide powerful deep feature learning without the requirement of large annotated datasets, thus alleviating the annotation bottleneck that is one of the main barriers to practical…

Machine Learning · Computer Science 2022-05-18 Linus Ericsson, Henry Gouk, Chen Change Loy, Timothy M. Hospedales

Speech representation learning: Learning bidirectional encoders with single-view, multi-view, and multi-task methods

This thesis focuses on representation learning for sequence data over time or space, aiming to improve downstream sequence prediction tasks by using the learned representations. Supervised learning has been the most dominant approach for…

Audio and Speech Processing · Electrical Eng. & Systems 2023-08-02 Qingming Tang

Self-supervised speech representation learning has recently been a prosperous research topic. Many algorithms have been proposed for learning useful representations from large-scale unlabeled data, and their applications to a wide range of…

Audio and Speech Processing · Electrical Eng. & Systems 2021-02-03 Yu-An Chung, Yonatan Belinkov, James Glass

Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective

Existing studies on self-supervised speech representation learning have focused on developing new training methods and applying pre-trained models for different applications. However, the quality of these models is often measured by the…

Audio and Speech Processing · Electrical Eng. & Systems 2024-01-18 Alexander H. Liu, Sung-Lin Yeh, James Glass

Self-supervised Learning for Speech Enhancement

Supervised learning for single-channel speech enhancement requires carefully labeled training examples where the noisy mixture is input into the network and the network is trained to produce an output close to the ideal target. To relax the…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-19 Yu-Che Wang, Shrikant Venkataramani, Paris Smaragdis

Recent Advancements in Self-Supervised Paradigms for Visual Feature Representation

We witnessed a massive growth in the supervised learning paradigm in the past decade. Supervised learning requires a large amount of labeled data to reach state-of-the-art performance. However, labeling the samples requires a lot of human…

Computer Vision and Pattern Recognition · Computer Science 2021-11-04 Mrinal Anand, Aditya Garg

Self-supervised representation learning from electroencephalography signals

The supervised learning paradigm is limited by the cost - and sometimes the impracticality - of data collection and labeling in multiple domains. Self-supervised learning, a paradigm which exploits the structure of unlabeled data to create…

Machine Learning · Computer Science 2019-11-14 Hubert Banville, Isabela Albuquerque, Aapo Hyvärinen, Graeme Moffat, Denis-Alexander Engemann, Alexandre Gramfort

Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments

Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition that still remains an important challenge. Data-driven supervised approaches, including ones based on…

Sound · Computer Science 2018-09-24 Zixing Zhang, Jürgen Geiger, Jouni Pohjalainen, Amr El-Desoky Mousa, Wenyu Jin, Björn Schuller

Revisiting Self-Supervised Visual Representation Learning

Unsupervised visual representation learning remains a largely unsolved problem in computer vision research. Among a big body of recently proposed approaches for unsupervised learning of visual representations, a class of self-supervised…

Computer Vision and Pattern Recognition · Computer Science 2019-01-28 Alexander Kolesnikov, Xiaohua Zhai, Lucas Beyer

Towards Unsupervised Representation Learning: Learning, Evaluating and Transferring Visual Representations

Unsupervised representation learning aims at finding methods that learn representations from data without annotation-based signals. Abstaining from annotations not only leads to economic benefits but may - and to some extent already does -…

Computer Vision and Pattern Recognition · Computer Science 2023-12-04 Bonifaz Stuhr

Self-Supervised Representations Improve End-to-End Speech Translation

End-to-end speech-to-text translation can provide a simpler and smaller system but is facing the challenge of data scarcity. Pre-training methods can leverage unlabeled data and have been shown to be effective on data-scarce settings. In…

Audio and Speech Processing · Electrical Eng. & Systems 2020-10-27 Anne Wu, Changhan Wang, Juan Pino, Jiatao Gu

Efficient Personalized Speech Enhancement through Self-Supervised Learning

This work presents self-supervised learning methods for developing monaural speaker-specific (i.e., personalized) speech enhancement models. While generalist models must broadly address many speakers, specialist models can adapt their…

Audio and Speech Processing · Electrical Eng. & Systems 2022-07-28 Aswin Sivaraman, Minje Kim

A Survey on Self-Supervised Representation Learning

Learning meaningful representations is at the heart of many tasks in the field of modern machine learning. Recently, a lot of methods were introduced that allow learning of image representations without supervision. These representations…

Machine Learning · Computer Science 2023-08-23 Tobias Uelwer, Jan Robine, Stefan Sylvius Wagner, Marc Höftmann, Eric Upschulte, Sebastian Konietzny, Maike Behrendt, Stefan Harmeling

Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks

Learning good representations without supervision is still an open issue in machine learning, and is particularly challenging for speech signals, which are often characterized by long sequences with a complex hierarchical structure. Some…

Machine Learning · Computer Science 2019-04-09 Santiago Pascual, Mirco Ravanelli, Joan Serrà, Antonio Bonafonte, Yoshua Bengio

Visually Guided Self Supervised Learning of Speech Representations

Self supervised representation learning has recently attracted a lot of research interest for both the audio and visual modalities. However, most works typically focus on a particular modality or feature alone and there has been very…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-21 Abhinav Shukla, Konstantinos Vougioukas, Pingchuan Ma, Stavros Petridis, Maja Pantic

An Unsupervised Autoregressive Model for Speech Representation Learning

This paper proposes a novel unsupervised autoregressive neural model for learning generic speech representations. In contrast to other speech representation learning methods that aim to remove noise or speaker variabilities, ours is…

Computation and Language · Computer Science 2019-06-20 Yu-An Chung, Wei-Ning Hsu, Hao Tang, James Glass

Scaling and Benchmarking Self-Supervised Visual Representation Learning

Self-supervised learning aims to learn representations from the data itself without explicit manual supervision. Existing efforts ignore a crucial aspect of self-supervised learning - the ability to scale to large amounts of data because…

Computer Vision and Pattern Recognition · Computer Science 2019-06-07 Priya Goyal, Dhruv Mahajan, Abhinav Gupta, Ishan Misra