Related papers: Music Source Separation with Generative Flow

Unsupervised Music Source Separation Using Differentiable Parametric Source Models

Supervised deep learning approaches to underdetermined audio source separation achieve state-of-the-art performance but require a dataset of mixtures along with their corresponding isolated source signals. Such datasets can be extremely…

Sound · Computer Science 2023-02-01 Kilian Schulze-Forster, Gaël Richard, Liam Kelley, Clement S. J. Doire, Roland Badeau

Adversarial Semi-Supervised Audio Source Separation applied to Singing Voice Extraction

The state of the art in music source separation employs neural networks trained in a supervised fashion on multi-track databases to estimate the sources from a given mixture. With only few datasets available, often extensive data…

Machine Learning · Computer Science 2018-04-09 Daniel Stoller, Sebastian Ewert, Simon Dixon

Generating Separated Singing Vocals Using a Diffusion Model Conditioned on Music Mixtures

Separating the individual elements in a musical mixture is an essential process for music analysis and practice. While this is generally addressed using neural networks optimized to mask or transform the time-frequency representation of a…

Sound · Computer Science 2025-11-27 Genís Plaja-Roglans, Yun-Ning Hung, Xavier Serra, Igor Pereira

Efficient and Fast Generative-Based Singing Voice Separation using a Latent Diffusion Model

Extracting individual elements from music mixtures is a valuable tool for music production and practice. While neural networks optimized to mask or transform mixture spectrograms into the individual source(s) have been the leading approach,…

Sound · Computer Science 2025-11-26 Genís Plaja-Roglans, Yun-Ning Hung, Xavier Serra, Igor Pereira

Unsupervised Audio Source Separation using Generative Priors

State-of-the-art under-determined audio source separation systems rely on supervised end-end training of carefully tailored neural network architectures operating either in the time or the spectral domain. However, these methods are…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-29 Vivek Narayanaswamy, Jayaraman J. Thiagarajan, Rushil Anirudh, Andreas Spanias

Source Separation by Flow Matching

We consider the problem of single-channel audio source separation with the goal of reconstructing $K$ sources from their mixture. We address this ill-posed problem with FLOSS (FLOw matching for Source Separation), a constrained generation…

Sound · Computer Science 2025-07-21 Robin Scheibler, John R. Hershey, Arnaud Doucet, Henry Li

GASS: Generalizing Audio Source Separation with Large-scale Data

Universal source separation targets at separating the audio sources of an arbitrary mix, removing the constraint to operate on a specific domain like speech or music. Yet, the potential of universal source separation is limited because most…

Sound · Computer Science 2023-10-03 Jordi Pons, Xiaoyu Liu, Santiago Pascual, Joan Serrà

Conditioned Source Separation for Music Instrument Performances

In music source separation, the number of sources may vary for each piece and some of the sources may belong to the same family of instruments, thus sharing timbral characteristics and making the sources more correlated. This leads to…

Sound · Computer Science 2021-07-09 Olga Slizovskaia, Gloria Haro, Emilia Gómez

Unsupervised Source Separation By Steering Pretrained Music Models

We showcase an unsupervised method that repurposes deep models trained for music generation and music tagging for audio source separation, without any retraining. An audio generation model is conditioned on an input mixture, producing a…

Sound · Computer Science 2021-10-26 Ethan Manilow, Patrick O'Reilly, Prem Seetharaman, Bryan Pardo

A fully differentiable model for unsupervised singing voice separation

A novel model was recently proposed by Schulze-Forster et al. in [1] for unsupervised music source separation. This model allows to tackle some of the major shortcomings of existing source separation frameworks. Specifically, it eliminates…

Signal Processing · Electrical Eng. & Systems 2024-01-31 Gael Richard, Pierre Chouteau, Bernardo Torres

Self-Supervised Music Source Separation Using Vector-Quantized Source Category Estimates

Music source separation is focused on extracting distinct sonic elements from composite tracks. Historically, many methods have been grounded in supervised learning, necessitating labeled data, which is occasionally constrained in its…

Sound · Computer Science 2023-11-23 Marco Pasini, Stefan Lattner, George Fazekas

Training-Free Multi-Step Audio Source Separation

Audio source separation aims to separate a mixture into target sources. Previous audio source separation systems usually conduct one-step inference, which does not fully explore the separation ability of models. In this work, we reveal that…

Sound · Computer Science 2025-05-27 Yongyi Zang, Jingyi Li, Qiuqiang Kong

Unsupervised Single-Channel Audio Separation with Diffusion Source Priors

Single-channel audio separation aims to separate individual sources from a single-channel mixture. Most existing methods rely on supervised learning with synthetically generated paired data. However, obtaining high-quality paired data in…

Audio and Speech Processing · Electrical Eng. & Systems 2025-12-24 Runwu Shi, Chang Li, Jiang Wang, Rui Zhang, Nabeela Khan, Benjamin Yen, Takeshi Ashizawa, Kazuhiro Nakadai

Unsupervised Sound Separation Using Mixture Invariant Training

In recent years, rapid progress has been made on the problem of single-channel sound separation using supervised training of deep neural networks. In such supervised approaches, a model is trained to predict the component sources from…

Audio and Speech Processing · Electrical Eng. & Systems 2020-10-27 Scott Wisdom, Efthymios Tzinis, Hakan Erdogan, Ron J. Weiss, Kevin Wilson, John R. Hershey

Multi-Source Diffusion Models for Simultaneous Music Generation and Separation

In this work, we define a diffusion-based generative model capable of both music synthesis and source separation by learning the score of the joint probability density of sources sharing a context. Alongside the classic total inference…

Sound · Computer Science 2024-03-19 Giorgio Mariani, Irene Tallini, Emilian Postolache, Michele Mancusi, Luca Cosmo, Emanuele Rodolà

Zero-Shot Duet Singing Voices Separation with Diffusion Models

In recent studies, diffusion models have shown promise as priors for solving audio inverse problems. These models allow us to sample from the posterior distribution of a target signal given an observed signal by manipulating the diffusion…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-22 Chin-Yun Yu, Emilian Postolache, Emanuele Rodolà, György Fazekas

Pre-training Music Classification Models via Music Source Separation

In this paper, we study whether music source separation can be used as a pre-training strategy for music representation learning, targeted at music classification tasks. To this end, we first pre-train U-Net networks under various music…

Audio and Speech Processing · Electrical Eng. & Systems 2024-04-24 Christos Garoufis, Athanasia Zlatintsi, Petros Maragos

Why does music source separation benefit from cacophony?

In music source separation, a standard training data augmentation procedure is to create new training samples by randomly combining instrument stems from different songs. These random mixes have mismatched characteristics compared to real…

Audio and Speech Processing · Electrical Eng. & Systems 2024-02-29 Chang-Bin Jeon, Gordon Wichern, François G. Germain, Jonathan Le Roux

A Style Transfer Approach to Source Separation

Training neural networks for source separation involves presenting a mixture recording at the input of the network and updating network parameters in order to produce an output that resembles the clean source. Consequently, supervised…

Sound · Computer Science 2019-05-10 Shrikant Venkataramani, Efthymios Tzinis, Paris Smaragdis

Audio Source Separation Using Variational Autoencoders and Weak Class Supervision

In this paper, we propose a source separation method that is trained by observing the mixtures and the class labels of the sources present in the mixture without any access to isolated sources. Since our method does not require source class…

Sound · Computer Science 2019-08-06 Ertuğ Karamatlı, Ali Taylan Cemgil, Serap Kırbız