Related papers: Unsupervised Music Source Separation Using Differe…

Music Source Separation with Generative Flow

Fully-supervised models for source separation are trained on parallel mixture-source data and are currently state-of-the-art. However, such parallel data is often difficult to obtain, and it is cumbersome to adapt trained models to mixtures…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-30 Ge Zhu , Jordan Darefsky , Fei Jiang , Anton Selitskiy , Zhiyao Duan

Adversarial Semi-Supervised Audio Source Separation applied to Singing Voice Extraction

The state of the art in music source separation employs neural networks trained in a supervised fashion on multi-track databases to estimate the sources from a given mixture. With only few datasets available, often extensive data…

Machine Learning · Computer Science 2018-04-09 Daniel Stoller , Sebastian Ewert , Simon Dixon

Blind Source Separation in Polyphonic Music Recordings Using Deep Neural Networks Trained via Policy Gradients

We propose a method for the blind separation of sounds of musical instruments in audio signals. We describe the individual tones via a parametric model, training a dictionary to capture the relative amplitudes of the harmonics. The model…

Audio and Speech Processing · Electrical Eng. & Systems 2021-08-10 Sören Schulze , Johannes Leuschner , Emily J. King

Unsupervised Audio Source Separation using Generative Priors

State-of-the-art under-determined audio source separation systems rely on supervised end-end training of carefully tailored neural network architectures operating either in the time or the spectral domain. However, these methods are…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-29 Vivek Narayanaswamy , Jayaraman J. Thiagarajan , Rushil Anirudh , Andreas Spanias

Deep Remix: Remixing Musical Mixtures Using a Convolutional Deep Neural Network

Audio source separation is a difficult machine learning problem and performance is measured by comparing extracted signals with the component source signals. However, if separation is motivated by the ultimate goal of re-mixing then…

Sound · Computer Science 2015-05-05 Andrew J. R Simpson , Gerard Roma , Mark D. Plumbley

Fast accuracy estimation of deep learning based multi-class musical source separation

Music source separation represents the task of extracting all the instruments from a given song. Recent breakthroughs on this challenge have gravitated around a single dataset, MUSDB, only limited to four instrument classes. Larger datasets…

Sound · Computer Science 2021-12-02 Alexandru Mocanu , Benjamin Ricaud , Milos Cernak

Training-Free Multi-Step Audio Source Separation

Audio source separation aims to separate a mixture into target sources. Previous audio source separation systems usually conduct one-step inference, which does not fully explore the separation ability of models. In this work, we reveal that…

Sound · Computer Science 2025-05-27 Yongyi Zang , Jingyi Li , Qiuqiang Kong

Interleaved Multitask Learning for Audio Source Separation with Independent Databases

Deep Neural Network-based source separation methods usually train independent models to optimize for the separation of individual sources. Although this can lead to good performance for well-defined targets, it can also be computationally…

Sound · Computer Science 2019-08-15 Clement S. J. Doire , Olumide Okubadejo

Demucs: Deep Extractor for Music Sources with extra unlabeled data remixed

We study the problem of source separation for music using deep learning with four known sources: drums, bass, vocals and other accompaniments. State-of-the-art approaches predict soft masks over mixture spectrograms while methods working on…

Sound · Computer Science 2019-09-04 Alexandre Défossez , Nicolas Usunier , Léon Bottou , Francis Bach

Unsupervised Single-Channel Audio Separation with Diffusion Source Priors

Single-channel audio separation aims to separate individual sources from a single-channel mixture. Most existing methods rely on supervised learning with synthetically generated paired data. However, obtaining high-quality paired data in…

Audio and Speech Processing · Electrical Eng. & Systems 2025-12-24 Runwu Shi , Chang Li , Jiang Wang , Rui Zhang , Nabeela Khan , Benjamin Yen , Takeshi Ashizawa , Kazuhiro Nakadai

Unsupervised Sound Separation Using Mixture Invariant Training

In recent years, rapid progress has been made on the problem of single-channel sound separation using supervised training of deep neural networks. In such supervised approaches, a model is trained to predict the component sources from…

Audio and Speech Processing · Electrical Eng. & Systems 2020-10-27 Scott Wisdom , Efthymios Tzinis , Hakan Erdogan , Ron J. Weiss , Kevin Wilson , John R. Hershey

Deep Bayesian Unsupervised Source Separation Based on a Complex Gaussian Mixture Model

This paper presents an unsupervised method that trains neural source separation by using only multichannel mixture signals. Conventional neural separation methods require a lot of supervised data to achieve excellent performance. Although…

Sound · Computer Science 2019-08-30 Yoshiaki Bando , Yoko Sasaki , Kazuyoshi Yoshii

Self-Supervised Music Source Separation Using Vector-Quantized Source Category Estimates

Music source separation is focused on extracting distinct sonic elements from composite tracks. Historically, many methods have been grounded in supervised learning, necessitating labeled data, which is occasionally constrained in its…

Sound · Computer Science 2023-11-23 Marco Pasini , Stefan Lattner , George Fazekas

Finding Strength in Weakness: Learning to Separate Sounds with Weak Supervision

While there has been much recent progress using deep learning techniques to separate speech and music audio signals, these systems typically require large collections of isolated sources during the training process. When extending audio…

Sound · Computer Science 2020-09-01 Fatemeh Pishdadian , Gordon Wichern , Jonathan Le Roux

Source Separation and Depthwise Separable Convolutions for Computer Audition

Given recent advances in deep music source separation, we propose a feature representation method that combines source separation with a state-of-the-art representation learning technique that is suitably repurposed for computer audition…

Sound · Computer Science 2020-12-08 Gabriel Mersy , Jin Hong Kuan

Unsupervised Composable Representations for Audio

Current generative models are able to generate high-quality artefacts but have been shown to struggle with compositional reasoning, which can be defined as the ability to generate complex structures from simpler elements. In this paper, we…

Machine Learning · Computer Science 2024-08-20 Giovanni Bindi , Philippe Esling

Unsupervised Source Separation By Steering Pretrained Music Models

We showcase an unsupervised method that repurposes deep models trained for music generation and music tagging for audio source separation, without any retraining. An audio generation model is conditioned on an input mixture, producing a…

Sound · Computer Science 2021-10-26 Ethan Manilow , Patrick O'Reilly , Prem Seetharaman , Bryan Pardo

A Recurrent Encoder-Decoder Approach with Skip-filtering Connections for Monaural Singing Voice Separation

The objective of deep learning methods based on encoder-decoder architectures for music source separation is to approximate either ideal time-frequency masks or spectral representations of the target music source(s). The spectral…

Sound · Computer Science 2018-04-25 Stylianos Ioannis Mimilakis , Konstantinos Drossos , Tuomas Virtanen , Gerald Schuller

A fully differentiable model for unsupervised singing voice separation

A novel model was recently proposed by Schulze-Forster et al. in [1] for unsupervised music source separation. This model allows to tackle some of the major shortcomings of existing source separation frameworks. Specifically, it eliminates…

Signal Processing · Electrical Eng. & Systems 2024-01-31 Gael Richard , Pierre Chouteau , Bernardo Torres

Unsupervised Single-Channel Speech Separation with a Diffusion Prior under Speaker-Embedding Guidance

Speech separation is a fundamental task in audio processing, typically addressed with fully supervised systems trained on paired mixtures. While effective, such systems typically rely on synthetic data pipelines, which may not reflect…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-30 Runwu Shi , Kai Li , Chang Li , Jiang Wang , Sihan Tan , Kazuhiro Nakadai