Related papers: Sampling-Frequency-Independent Audio Source Separa…

Sampling Frequency Independent Dialogue Separation

In some DNNs for audio source separation, the relevant model parameters are independent of the sampling frequency of the audio used for training. Considering the application of dialogue separation, this is shown for two DNN architectures: a…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-07 Jouni Paulus, Matteo Torcoli

Raw Multi-Channel Audio Source Separation using Multi-Resolution Convolutional Auto-Encoders

Supervised multi-channel audio source separation requires extracting useful spectral, temporal, and spatial features from the mixed signals. The success of many existing systems is therefore largely dependent on the choice of features used…

Sound · Computer Science 2018-03-05 Emad M. Grais, Dominic Ward, Mark D. Plumbley

Unsupervised Music Source Separation Using Differentiable Parametric Source Models

Supervised deep learning approaches to underdetermined audio source separation achieve state-of-the-art performance but require a dataset of mixtures along with their corresponding isolated source signals. Such datasets can be extremely…

Sound · Computer Science 2023-02-01 Kilian Schulze-Forster, Gaël Richard, Liam Kelley, Clement S. J. Doire, Roland Badeau

Deep Remix: Remixing Musical Mixtures Using a Convolutional Deep Neural Network

Audio source separation is a difficult machine learning problem and performance is measured by comparing extracted signals with the component source signals. However, if separation is motivated by the ultimate goal of re-mixing then…

Sound · Computer Science 2015-05-05 Andrew J. R. Simpson, Gerard Roma, Mark D. Plumbley

Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation

Speaker-independent speech separation has achieved remarkable performance in recent years with the development of deep neural networks (DNNs). Various network architectures, from traditional convolutional neural network (CNN) and recurrent…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-17 Xue Yang, Changchun Bao

A Hands-on Comparison of DNNs for Dialog Separation Using Transfer Learning from Music Source Separation

This paper describes a hands-on comparison of state-of-the-art music source separation deep neural networks (DNNs) before and after task-specific fine-tuning for separating speech content from non-speech content in broadcast audio…

Audio and Speech Processing · Electrical Eng. & Systems 2021-06-23 Martin Strauss, Jouni Paulus, Matteo Torcoli, Bernd Edler

Raw Waveform-based Audio Classification Using Sample-level CNN Architectures

Music, speech, and acoustic scene sound are often handled separately in the audio domain because of their different signal characteristics. However, as the image domain advances rapidly thanks to versatile image classification models, it is necessary…

Sound · Computer Science 2017-12-05 Jongpil Lee, Taejun Kim, Jiyoung Park, Juhan Nam

Audio Super Resolution using Neural Networks

We introduce a new audio processing technique that increases the sampling rate of signals such as speech or music using deep convolutional neural networks. Our model is trained on pairs of low and high-quality audio examples; at test-time,…

Sound · Computer Science 2017-08-03 Volodymyr Kuleshov, S. Zayd Enam, Stefano Ermon

Unsupervised Single-Channel Audio Separation with Diffusion Source Priors

Single-channel audio separation aims to separate individual sources from a single-channel mixture. Most existing methods rely on supervised learning with synthetically generated paired data. However, obtaining high-quality paired data in…

Audio and Speech Processing · Electrical Eng. & Systems 2025-12-24 Runwu Shi, Chang Li, Jiang Wang, Rui Zhang, Nabeela Khan, Benjamin Yen, Takeshi Ashizawa, Kazuhiro Nakadai

Audio Source Separation Using a Deep Autoencoder

This paper proposes a novel framework for unsupervised audio source separation using a deep autoencoder. The characteristics of the unknown source signals in the mixed input are learned automatically by properly configured autoencoders implemented…

Sound · Computer Science 2014-12-24 Giljin Jang, Han-Gyu Kim, Yung-Hwan Oh

Sound Source Separation Using Latent Variational Block-Wise Disentanglement

While neural network approaches have made significant strides in resolving classical signal processing problems, it is often the case that hybrid approaches that draw insight from both signal processing and neural networks produce more…

Audio and Speech Processing · Electrical Eng. & Systems 2024-02-13 Karim Helwani, Masahito Togami, Paris Smaragdis, Michael M. Goodwin

Multi-Resolution Fully Convolutional Neural Networks for Monaural Audio Source Separation

In deep neural networks with convolutional layers, each layer typically has a fixed-size, single-resolution receptive field (RF). Convolutional layers with a large RF capture global information from the input features, while layers with small…

Sound · Computer Science 2017-11-01 Emad M. Grais, Hagen Wierstorf, Dominic Ward, Mark D. Plumbley

Independent Deeply Learned Tensor Analysis for Determined Audio Source Separation

We address the determined audio source separation problem in the time-frequency domain. In independent deeply learned matrix analysis (IDLMA), it is assumed that the inter-frequency correlation of each source spectrum is zero, which is…

Sound · Computer Science 2021-06-11 Naoki Narisawa, Rintaro Ikeshita, Norihiro Takamune, Daichi Kitamura, Tomohiko Nakamura, Hiroshi Saruwatari, Tomohiro Nakatani

Source Separation and Depthwise Separable Convolutions for Computer Audition

Given recent advances in deep music source separation, we propose a feature representation method that combines source separation with a state-of-the-art representation learning technique that is suitably repurposed for computer audition…

Sound · Computer Science 2020-12-08 Gabriel Mersy, Jin Hong Kuan

Depthwise Separable Convolutions Versus Recurrent Neural Networks for Monaural Singing Voice Separation

Recent approaches for music source separation are almost exclusively based on deep neural networks, mostly employing recurrent neural networks (RNNs). Although RNNs are in many cases superior to other types of deep neural networks for…

Audio and Speech Processing · Electrical Eng. & Systems 2020-07-08 Pyry Pyykkönen, Styliannos I. Mimilakis, Konstantinos Drossos, Tuomas Virtanen

Multi-scale Multi-band DenseNets for Audio Source Separation

This paper deals with the problem of audio source separation. To handle the complex and ill-posed nature of the problems of audio source separation, the current state-of-the-art approaches employ deep neural networks to obtain instrumental…

Sound · Computer Science 2017-06-30 Naoya Takahashi, Yuki Mitsufuji

Efficient and Fast Generative-Based Singing Voice Separation using a Latent Diffusion Model

Extracting individual elements from music mixtures is a valuable tool for music production and practice. While neural networks optimized to mask or transform mixture spectrograms into the individual source(s) have been the leading approach,…

Sound · Computer Science 2025-11-26 Genís Plaja-Roglans, Yun-Ning Hung, Xavier Serra, Igor Pereira

D3Net: Densely connected multidilated DenseNet for music source separation

Music source separation involves a large input field to model a long-term dependence of an audio signal. Previous convolutional neural network (CNN)-based approaches address the large input field modeling using sequentially down- and…

Audio and Speech Processing · Electrical Eng. & Systems 2021-03-30 Naoya Takahashi, Yuki Mitsufuji

Voice and accompaniment separation in music using self-attention convolutional neural network

Music source separation has been a popular topic in signal processing for decades, not only because of its technical difficulty, but also due to its importance to many commercial applications, such as automatic karaoke and remixing. In this…

Audio and Speech Processing · Electrical Eng. & Systems 2020-03-23 Yuzhou Liu, Balaji Thoshkahna, Ali Milani, Trausti Kristjansson

Time-Domain Audio Source Separation Based on Wave-U-Net Combined with Discrete Wavelet Transform

We propose a time-domain audio source separation method using down-sampling (DS) and up-sampling (US) layers based on a discrete wavelet transform (DWT). The proposed method is based on one of the state-of-the-art deep neural networks,…

Sound · Computer Science 2022-12-05 Tomohiko Nakamura, Hiroshi Saruwatari