Related papers: Multi-Task Self-Supervised Pre-Training for Music …

An Experimental Comparison Of Multi-view Self-supervised Methods For Music Tagging

Self-supervised learning has emerged as a powerful way to pre-train generalizable machine learning models on large amounts of unlabeled data. It is particularly compelling in the music domain, where obtaining labeled data is time-consuming,…

Sound · Computer Science 2024-04-16 Gabriel Meseguer-Brocal, Dorian Desblancs, Romain Hennequin

Supervised and Unsupervised Learning of Audio Representations for Music Understanding

In this work, we provide a broad comparative analysis of strategies for pre-training audio understanding models for several tasks in the music domain, including labelling of genre, era, origin, mood, instrumentation, key, pitch, vocal…

Sound · Computer Science 2022-10-11 Matthew C. McCallum, Filip Korzeniowski, Sergio Oramas, Fabien Gouyon, Andreas F. Ehmann

Label-efficient audio classification through multitask learning and self-supervision

While deep learning has been incredibly successful in modeling tasks with large, carefully curated labeled datasets, its application to problems with limited labeled data remains a challenge. The aim of the present work is to improve the…

Audio and Speech Processing · Electrical Eng. & Systems 2019-10-29 Tyler Lee, Ting Gong, Suchismita Padhy, Andrew Rouditchenko, Anthony Ndirango

Pretext Tasks selection for multitask self-supervised speech representation learning

Through solving pretext tasks, self-supervised learning leverages unlabeled data to extract useful latent representations replacing traditional input features in the downstream task. In audio/speech signal processing, a wide range of…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-23 Salah Zaiem, Titouan Parcollet, Slim Essid, Abdel Heba

Self-supervised Auxiliary Loss for Metric Learning in Music Similarity-based Retrieval and Auto-tagging

In the realm of music information retrieval, similarity-based retrieval and auto-tagging serve as essential components. Given the limitations and non-scalability of human supervision signals, it becomes crucial for models to learn from…

Sound · Computer Science 2023-04-18 Taketo Akama, Hiroaki Kitano, Katsuhiro Takematsu, Yasushi Miyajima, Natalia Polouliakh

Self-Supervised Learning for Audio-Based Emotion Recognition

Emotion recognition models using audio input data can enable the development of interactive systems with applications in mental healthcare, marketing, gaming, and social media analysis. While the field of affective computing using audio…

Sound · Computer Science 2023-07-25 Peranut Nimitsurachat, Peter Washington

Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks

Representation learning from unlabeled data has been of major interest in artificial intelligence research. While self-supervised speech representation learning has been popular in the speech research community, very few works have…

Sound · Computer Science 2022-01-10 Sangeeta Srivastava, Yun Wang, Andros Tjandra, Anurag Kumar, Chunxi Liu, Kritika Singh, Yatharth Saraf

Data Selection Effects on Self-Supervised Learning of Audio Representations for French Audiovisual Broadcasts

Audio and speech self-supervised encoder models are now widely used for a lot of different tasks. Many of these models are often trained on clean segmented speech content such as LibriSpeech. In this paper, we look into how the pretraining…

Audio and Speech Processing · Electrical Eng. & Systems 2026-04-13 Valentin Pelloin, Lina Bekkali, Reda Dehak, David Doukhan

Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Multi-Channel Conformer

Supervised learning methods have shown effectiveness in estimating spatial acoustic parameters such as time difference of arrival, direct-to-reverberant ratio and reverberation time. However, they still suffer from the simulation-to-reality…

Sound · Computer Science 2024-09-10 Bing Yang, Xiaofei Li

Self-supervised Learning for Acoustic Few-Shot Classification

Labelled data are limited and self-supervised learning is one of the most important approaches for reducing labelling requirements. While it has been extensively explored in the image domain, it has so far not received the same amount of…

Sound · Computer Science 2025-05-16 Jingyong Liang, Bernd Meyer, Isaac Ning Lee, Thanh-Toan Do

A Survey of the Impact of Self-Supervised Pretraining for Diagnostic Tasks with Radiological Images

Self-supervised pretraining has been observed to be effective at improving feature representations for transfer learning, leveraging large amounts of unlabelled data. This review summarizes recent research into its usage in X-ray, computed…

Machine Learning · Computer Science 2023-09-07 Blake VanBerlo, Jesse Hoey, Alexander Wong

Semi-Supervised Music Tagging Transformer

We present Music Tagging Transformer that is trained with a semi-supervised approach. The proposed model captures local acoustic characteristics in shallow convolutional layers, then temporally summarizes the sequence of the extracted…

Sound · Computer Science 2021-11-29 Minz Won, Keunwoo Choi, Xavier Serra

Self-Supervised Pretraining on Paired Sequences of fMRI Data for Transfer Learning to Brain Decoding Tasks

In this work we introduce a self-supervised pretraining framework for transformers on functional Magnetic Resonance Imaging (fMRI) data. First, we pretrain our architecture on two self-supervised tasks simultaneously to teach the model a…

Machine Learning · Computer Science 2023-05-17 Sean Paulsen, Michael Casey

Meta-learning of semi-supervised learning from tasks with heterogeneous attribute spaces

We propose a meta-learning method for semi-supervised learning that learns from multiple tasks with heterogeneous attribute spaces. The existing semi-supervised meta-learning methods assume that all tasks share the same attribute space,…

Machine Learning · Computer Science 2023-11-10 Tomoharu Iwata, Atsutoshi Kumagai

Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces

We combine multi-task learning and semi-supervised learning by inducing a joint embedding space between disparate label spaces and learning transfer functions between label embeddings, enabling us to jointly leverage unlabelled data and…

Computation and Language · Computer Science 2018-04-10 Isabelle Augenstein, Sebastian Ruder, Anders Søgaard

Multi-Task Semi-Supervised Adversarial Autoencoding for Speech Emotion Recognition

In spite of the emerging importance of Speech Emotion Recognition (SER), the state-of-the-art accuracy is quite low and needs improvement to make commercial applications of SER viable. A key underlying reason for the low accuracy is the…

Sound · Computer Science 2020-03-24 Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Julien Epps, Björn W. Schuller

Toward Leveraging Pre-Trained Self-Supervised Frontends for Automatic Singing Voice Understanding Tasks: Three Case Studies

Automatic singing voice understanding tasks, such as singer identification, singing voice transcription, and singing technique classification, benefit from data-driven approaches that utilize deep learning techniques. These approaches work…

Sound · Computer Science 2023-09-06 Yuya Yamamoto

Self-Supervised Learning of Audio Representations from Permutations with Differentiable Ranking

Self-supervised pre-training using so-called "pretext" tasks has recently shown impressive performance across a wide range of modalities. In this work, we advance self-supervised learning from permutations, by pre-training a model to…

Sound · Computer Science 2021-05-05 Andrew N Carr, Quentin Berthet, Mathieu Blondel, Olivier Teboul, Neil Zeghidour

Self-Supervised Learning based Monaural Speech Enhancement with Multi-Task Pre-Training

In self-supervised learning, it is challenging to reduce the gap between the enhancement performance on the estimated and target speech signals with existing pre-tasks. In this paper, we propose a multi-task pre-training method to improve…

Sound · Computer Science 2022-01-02 Yi Li, Yang Sun, Syed Mohsen Naqvi

Deep Learning for Audio Transcription on Low-Resource Datasets

In training a deep learning system to perform audio transcription, two practical problems may arise. Firstly, most datasets are weakly labelled, having only a list of events present in each recording without any temporal information for…

Machine Learning · Computer Science 2018-07-12 Veronica Morfi, Dan Stowell