Related papers: User-guided Generative Source Separation

200 papers

Recent breakthroughs in language-queried audio source separation (LASS) have shown that generative models can achieve higher separation audio quality than traditional masking-based approaches. However, two key limitations restrict their…

We propose DiffSep, a new single channel source separation method based on score-matching of a stochastic differential equation (SDE). We craft a tailored continuous time diffusion-mixing process starting from the separated sources and…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-03 Robin Scheibler, Youna Ji, Soo-Whan Chung, Jaeuk Byun, Soyeon Choe, Min-Seok Choi

In this work, we define a diffusion-based generative model capable of both music synthesis and source separation by learning the score of the joint probability density of sources sharing a context. Alongside the classic total inference…

Music source separation (MSS) is a task that involves isolating individual sound sources, or stems, from mixed audio signals. This paper presents an ensemble approach to MSS, combining several state-of-the-art architectures to achieve…

Sound · Computer Science 2024-10-29 Saarth Vardhan, Pavani R Acharya, Samarth S Rao, Oorjitha Ratna Jasthi, S Natarajan

Multi-Source Diffusion Models (MSDM) allow for compositional musical generation tasks: generating a set of coherent sources, creating accompaniments, and performing source separation. Despite their versatility, they require estimating the…

Sound · Computer Science 2024-03-19 Emilian Postolache, Giorgio Mariani, Luca Cosmo, Emmanouil Benetos, Emanuele Rodolà

Generating multi-instrument music from symbolic music representations is an important task in Music Information Retrieval (MIR). A central but still largely unsolved problem in this context is musically and acoustically informed control in…

Sound · Computer Science 2023-09-22 Ben Maman, Johannes Zeitler, Meinard Müller, Amit H. Bermano

Masked generative models (MGMs) have shown impressive generative ability while requiring an order of magnitude fewer sampling steps than continuous diffusion models. However, MGMs still underperform in image synthesis compared to…

Computer Vision and Pattern Recognition · Computer Science 2024-10-18 Jiwan Hur, Dong-Jae Lee, Gyojin Han, Jaehyun Choi, Yunho Jeon, Junmo Kim

Separating the individual elements in a musical mixture is an essential process for music analysis and practice. While this is generally addressed using neural networks optimized to mask or transform the time-frequency representation of a…

Sound · Computer Science 2025-11-27 Genís Plaja-Roglans, Yun-Ning Hung, Xavier Serra, Igor Pereira

Most music generation models directly generate a single music mixture. To allow for more flexible and controllable generation, the Multi-Source Diffusion Model (MSDM) has been proposed to model music as a mixture of multiple instrumental…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-18 Zhongweiyang Xu, Debottam Dutta, Yu-Lin Wei, Romit Roy Choudhury

Despite phenomenal progress in recent years, state-of-the-art music separation systems produce source estimates with significant perceptual shortcomings, such as adding extraneous noise or removing harmonics. We propose a post-processing…

Sound · Computer Science 2022-08-29 Noah Schaffer, Boaz Cogan, Ethan Manilow, Max Morrison, Prem Seetharaman, Bryan Pardo

We demonstrate how conditional generation from diffusion models can be used to tackle a variety of realistic tasks in the production of music in 44.1kHz stereo audio with sampling-time guidance. The scenarios we consider include…

Sound · Computer Science 2023-12-06 Mark Levy, Bruno Di Giorgi, Floris Weers, Angelos Katharopoulos, Tom Nickson

In this work, we propose an approach to music source separation that uses a generative diffusion model as a last-stage refinement on top of a deterministic separator, progressively enhancing the separated sources through iterative…

Sound · Computer Science 2026-04-28 Tornike Karchkhadze, Mohammad Rasool Izadi, Shuo Zhang, Shlomo Dubnov

Audio source separation is fundamental for machines to understand complex acoustic environments and underpins numerous audio applications. Current supervised deep learning approaches, while powerful, are limited by the need for extensive,…

We present MGE-LDM, a unified latent diffusion framework for simultaneous music generation, source imputation, and query-driven source separation. Unlike prior approaches constrained to fixed instrument classes, MGE-LDM learns a joint…

Sound · Computer Science 2025-10-21 Yunkee Chae, Kyogu Lee

Fully-supervised models for source separation are trained on parallel mixture-source data and are currently state-of-the-art. However, such parallel data is often difficult to obtain, and it is cumbersome to adapt trained models to mixtures…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-30 Ge Zhu, Jordan Darefsky, Fei Jiang, Anton Selitskiy, Zhiyao Duan

Extracting individual elements from music mixtures is a valuable tool for music production and practice. While neural networks optimized to mask or transform mixture spectrograms into the individual source(s) have been the leading approach,…

Sound · Computer Science 2025-11-26 Genís Plaja-Roglans, Yun-Ning Hung, Xavier Serra, Igor Pereira

Generative models have attracted considerable attention for speech separation tasks, and among these, diffusion-based methods are being explored. Despite the notable success of diffusion techniques in generation tasks, their adaptation to…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-28 Jinwei Dong, Xinsheng Wang, Qirong Mao

Dataset distillation has emerged as an effective strategy, significantly reducing training costs and facilitating more efficient model deployment. Recent advances have leveraged generative models to distill datasets by capturing the…

Computer Vision and Pattern Recognition · Computer Science 2025-05-27 Jeffrey A. Chan-Santiago, Praveen Tirupattur, Gaurav Kumar Nayak, Gaowen Liu, Mubarak Shah

Universal source separation aims to separate the audio sources of an arbitrary mix, removing the constraint to operate on a specific domain like speech or music. Yet, the potential of universal source separation is limited because most…

Sound · Computer Science 2023-10-03 Jordi Pons, Xiaoyu Liu, Santiago Pascual, Joan Serrà

Similar to colorization in computer vision, instrument separation assigns instrument labels (e.g., piano, guitar) to notes from unlabeled mixtures that contain only performance information. To address the problem, we adopt diffusion…

Sound · Computer Science 2022-09-08 Sangjun Han, Hyeongrae Ihm, DaeHan Ahn, Woohyung Lim