Related papers: Unsupervised Audio Source Separation using Generat…

200 papers

State of the art audio source separation models rely on supervised data-driven approaches, which can be expensive in terms of labeling resources. On the other hand, approaches for training these models without any direct supervision are…

Machine Learning · Computer Science 2022-04-04 Michele Mancusi, Emilian Postolache, Giorgio Mariani, Marco Fumero, Andrea Santilli, Luca Cosmo, Emanuele Rodolà

We showcase an unsupervised method that repurposes deep models trained for music generation and music tagging for audio source separation, without any retraining. An audio generation model is conditioned on an input mixture, producing a…

Sound · Computer Science 2021-10-26 Ethan Manilow, Patrick O'Reilly, Prem Seetharaman, Bryan Pardo

Supervised deep learning approaches to underdetermined audio source separation achieve state-of-the-art performance but require a dataset of mixtures along with their corresponding isolated source signals. Such datasets can be extremely…

Single-channel audio separation aims to separate individual sources from a single-channel mixture. Most existing methods rely on supervised learning with synthetically generated paired data. However, obtaining high-quality paired data in…

Audio and Speech Processing · Electrical Eng. & Systems 2025-12-24 Runwu Shi, Chang Li, Jiang Wang, Rui Zhang, Nabeela Khan, Benjamin Yen, Takeshi Ashizawa, Kazuhiro Nakadai

The state of the art in music source separation employs neural networks trained in a supervised fashion on multi-track databases to estimate the sources from a given mixture. With only few datasets available, often extensive data…

Machine Learning · Computer Science 2018-04-09 Daniel Stoller, Sebastian Ewert, Simon Dixon

Fully-supervised models for source separation are trained on parallel mixture-source data and are currently state-of-the-art. However, such parallel data is often difficult to obtain, and it is cumbersome to adapt trained models to mixtures…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-30 Ge Zhu, Jordan Darefsky, Fei Jiang, Anton Selitskiy, Zhiyao Duan

Current generative models are able to generate high-quality artefacts but have been shown to struggle with compositional reasoning, which can be defined as the ability to generate complex structures from simpler elements. In this paper, we…

Machine Learning · Computer Science 2024-08-20 Giovanni Bindi, Philippe Esling

Despite substantial progress in signal source separation, results for richly structured data continue to contain perceptible artifacts. In contrast, recent deep generative models can produce authentic samples in a variety of domains that…

Machine Learning · Computer Science 2020-09-22 Vivek Jayaram, John Thickstun

Gaussian process (GP) audio source separation is a time-domain approach that circumvents the inherent phase approximation issue of spectrogram based methods. Furthermore, through its kernel, GPs elegantly incorporate prior knowledge about…

Audio and Speech Processing · Electrical Eng. & Systems 2018-11-22 Pablo A. Alvarado, Mauricio A. Álvarez, Dan Stowell

Unsupervised deep learning methods for solving audio restoration problems extensively rely on carefully tailored neural architectures that carry strong inductive biases for defining priors in the time or spectral domain. In this context,…

The majority of deep learning-based speech enhancement methods require paired clean-noisy speech data. Collecting such data at scale in real-world conditions is infeasible, which has led the community to rely on synthetically generated…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-30 Dominik Klement, Matthew Maciejewski, Sanjeev Khudanpur, Jan Černocký, Lukáš Burget

Deep learning approaches have recently achieved impressive performance on both audio source separation and sound classification. Most audio source separation approaches focus only on separating sources belonging to a restricted domain of…

Sound · Computer Science 2021-05-14 Efthymios Tzinis, Scott Wisdom, John R. Hershey, Aren Jansen, Daniel P. W. Ellis

Separating audio mixtures into individual instrument tracks has been a long standing challenging task. We introduce a novel weakly supervised audio source separation approach based on deep adversarial learning. Specifically, our loss…

Sound · Computer Science 2018-05-18 Ning Zhang, Junchi Yan, Yuchen Zhou

Speech separation is a fundamental task in audio processing, typically addressed with fully supervised systems trained on paired mixtures. While effective, such systems typically rely on synthetic data pipelines, which may not reflect…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-30 Runwu Shi, Kai Li, Chang Li, Jiang Wang, Sihan Tan, Kazuhiro Nakadai

Real-world sound scenes consist of time-varying collections of sound sources, each generating characteristic sound events that are mixed together in audio recordings. The association of these constituent sound events with their mixture and…

While there has been much recent progress using deep learning techniques to separate speech and music audio signals, these systems typically require large collections of isolated sources during the training process. When extending audio…

Sound · Computer Science 2020-09-01 Fatemeh Pishdadian, Gordon Wichern, Jonathan Le Roux

We propose a new method for separating superimposed sources using diffusion-based generative models. Our method relies only on separately trained statistical priors of independent sources to establish a new objective function guided by…

Machine Learning · Computer Science 2024-01-18 Tejas Jayashankar, Gary C. F. Lee, Alejandro Lancho, Amir Weiss, Yury Polyanskiy, Gregory W. Wornell

Extracting individual elements from music mixtures is a valuable tool for music production and practice. While neural networks optimized to mask or transform mixture spectrograms into the individual source(s) have been the leading approach,…

Sound · Computer Science 2025-11-26 Genís Plaja-Roglans, Yun-Ning Hung, Xavier Serra, Igor Pereira

Audio source separation is fundamental for machines to understand complex acoustic environments and underpins numerous audio applications. Current supervised deep learning approaches, while powerful, are limited by the need for extensive,…

Recently, audio-visual separation approaches have taken advantage of the natural synchronization between the two modalities to boost audio source separation performance. They extracted high-level semantics from visual inputs as the guidance…

Sound · Computer Science 2024-07-08 Shentong Mo, Yapeng Tian