Related papers: Improving DNN-based Music Source Separation using …

Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation

Deep neural network based methods have been successfully applied to music source separation. They typically learn a mapping from a mixture spectrogram to a set of source spectrograms, all with magnitudes only. This approach has several…

Sound · Computer Science 2021-09-14 Qiuqiang Kong, Yin Cao, Haohe Liu, Keunwoo Choi, Yuxuan Wang

Music Source Separation Based on a Lightweight Deep Learning Framework (DTTNET: DUAL-PATH TFC-TDF UNET)

Music source separation (MSS) aims to extract 'vocals', 'drums', 'bass' and 'other' tracks from a piece of mixed music. While deep learning methods have shown impressive results, there is a trend toward larger models. In our paper, we…

Audio and Speech Processing · Electrical Eng. & Systems 2024-03-20 Junyu Chen, Susmitha Vekkot, Pancham Shukla

Discriminative Enhancement for Single Channel Audio Source Separation using Deep Neural Networks

The sources separated by most single channel audio source separation techniques are usually distorted and each separated source contains residual signals from the other sources. To tackle this problem, we propose to enhance the separated…

Sound · Computer Science 2016-12-21 Emad M. Grais, Gerard Roma, Andrew J. R. Simpson, Mark D. Plumbley

Music Source Separation with Band-split RNN

The performance of music source separation (MSS) models has been greatly improved in recent years thanks to the development of novel neural network architectures and training pipelines. However, recent model designs for MSS were mainly…

Audio and Speech Processing · Electrical Eng. & Systems 2022-10-03 Yi Luo, Jianwei Yu

Multi-scale temporal-frequency attention for music source separation

In recent years, deep neural network (DNN) based approaches have achieved state-of-the-art performance for music source separation (MSS). Although previous methods have addressed the large receptive field modeling using various…

Audio and Speech Processing · Electrical Eng. & Systems 2022-09-05 Lianwu Chen, Xiguang Zheng, Chen Zhang, Liang Guo, Bing Yu

Harmonic-Percussive Source Separation with Deep Neural Networks and Phase Recovery

Harmonic/percussive source separation (HPSS) consists in separating the pitched instruments from the percussive parts in a music mixture. In this paper, we propose to apply the recently introduced Masker-Denoiser with twin networks (MaD…

Sound · Computer Science 2018-07-31 Konstantinos Drossos, Paul Magron, Stylianos Ioannis Mimilakis, Tuomas Virtanen

D3Net: Densely connected multidilated DenseNet for music source separation

Music source separation involves a large input field to model a long-term dependence of an audio signal. Previous convolutional neural network (CNN)-based approaches address the large input field modeling using sequentially down- and…

Audio and Speech Processing · Electrical Eng. & Systems 2021-03-30 Naoya Takahashi, Yuki Mitsufuji

Spectrogram Feature Losses for Music Source Separation

In this paper we study deep learning-based music source separation, and explore using an alternative loss to the standard spectrogram pixel-level L2 loss for model training. Our main contribution is in demonstrating that adding a high-level…

Sound · Computer Science 2019-06-28 Abhimanyu Sahai, Romann Weber, Brian McWilliams

Learned complex masks for multi-instrument source separation

Music source separation in the time-frequency domain is commonly achieved by applying a soft or binary mask to the magnitude component of (complex) spectrograms. The phase component is usually not estimated, but instead copied from the…

Sound · Computer Science 2021-03-25 Andreas Jansson, Rachel M. Bittner, Nicola Montecchio, Tillman Weyde

Deep neural networks for single channel source separation

In this paper, a novel approach for single channel source separation (SCSS) using a deep neural network (DNN) architecture is introduced. Unlike previous studies in which DNN and other classifiers were used for classifying time-frequency…

Neural and Evolutionary Computing · Computer Science 2013-11-13 Emad M. Grais, Mehmet Umut Sen, Hakan Erdogan

A Hands-on Comparison of DNNs for Dialog Separation Using Transfer Learning from Music Source Separation

This paper describes a hands-on comparison on using state-of-the-art music source separation deep neural networks (DNNs) before and after task-specific fine-tuning for separating speech content from non-speech content in broadcast audio…

Audio and Speech Processing · Electrical Eng. & Systems 2021-06-23 Martin Strauss, Jouni Paulus, Matteo Torcoli, Bernd Edler

Phase reconstruction based on recurrent phase unwrapping with deep neural networks

Phase reconstruction, which estimates phase from a given amplitude spectrogram, is an active research field in acoustical signal processing with many applications including audio synthesis. To take advantage of rich knowledge from data,…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-17 Yoshiki Masuyama, Kohei Yatabe, Yuma Koizumi, Yasuhiro Oikawa, Noboru Harada

Audio Source Separation via Multi-Scale Learning with Dilated Dense U-Nets

Modern audio source separation techniques rely on optimizing sequence model architectures, such as 1D-CNNs, on mixture recordings to generalize well to unseen mixtures. Specifically, recent focus is on time-domain based architectures such…

Machine Learning · Computer Science 2019-04-09 Vivek Sivaraman Narayanaswamy, Sameeksha Katoch, Jayaraman J. Thiagarajan, Huan Song, Andreas Spanias

Source Separation and Depthwise Separable Convolutions for Computer Audition

Given recent advances in deep music source separation, we propose a feature representation method that combines source separation with a state-of-the-art representation learning technique that is suitably repurposed for computer audition…

Sound · Computer Science 2020-12-08 Gabriel Mersy, Jin Hong Kuan

Model-based STFT phase recovery for audio source separation

For audio source separation applications, it is common to estimate the magnitude of the short-time Fourier transform (STFT) of each source. In order to then synthesize time-domain signals, it is necessary to recover the phase of the…

Sound · Computer Science 2018-02-28 Paul Magron, Roland Badeau, Bertrand David

Deep Transform: Cocktail Party Source Separation via Complex Convolution in a Deep Neural Network

Convolutional deep neural networks (DNN) are state of the art in many engineering problems but have not yet addressed the issue of how to deal with complex spectrograms. Here, we use circular statistics to provide a convenient probabilistic…

Sound · Computer Science 2015-04-14 Andrew J. R. Simpson

Depthwise Separable Convolutions Versus Recurrent Neural Networks for Monaural Singing Voice Separation

Recent approaches for music source separation are almost exclusively based on deep neural networks, mostly employing recurrent neural networks (RNNs). Although RNNs are in many cases superior to other types of deep neural networks for…

Audio and Speech Processing · Electrical Eng. & Systems 2020-07-08 Pyry Pyykkönen, Styliannos I. Mimilakis, Konstantinos Drossos, Tuomas Virtanen

Does Phase Matter For Monaural Source Separation?

The "cocktail party" problem of fully separating multiple sources from a single channel audio waveform remains unsolved. Current biological understanding of neural encoding suggests that phase information is preserved and utilized at every…

Sound · Computer Science 2017-11-06 Mohit Dubey, Garrett Kenyon, Nils Carlson, Austin Thresher

Improving Music Source Separation with Diffusion and Consistency Refinement

In this work, we propose an approach to music source separation that uses a generative diffusion model as a last-stage refinement on top of a deterministic separator, progressively enhancing the separated sources through iterative…

Sound · Computer Science 2026-04-28 Tornike Karchkhadze, Mohammad Rasool Izadi, Shuo Zhang, Shlomo Dubnov

Deep Remix: Remixing Musical Mixtures Using a Convolutional Deep Neural Network

Audio source separation is a difficult machine learning problem and performance is measured by comparing extracted signals with the component source signals. However, if separation is motivated by the ultimate goal of re-mixing then…

Sound · Computer Science 2015-05-05 Andrew J. R. Simpson, Gerard Roma, Mark D. Plumbley