Related papers: Compute and memory efficient universal sound sourc…

Sudo rm -rf: Efficient Networks for Universal Audio Source Separation

In this paper, we present an efficient neural network for end-to-end general purpose audio source separation. Specifically, the backbone structure of this convolutional network is the SUccessive DOwnsampling and Resampling of…

Audio and Speech Processing · Electrical Eng. & Systems 2021-05-14 Efthymios Tzinis, Zhepei Wang, Paris Smaragdis

MMDenseLSTM: An efficient combination of convolutional and recurrent neural networks for audio source separation

Deep neural networks have become an indispensable technique for audio source separation (ASS). It was recently reported that a variant of CNN architecture called MMDenseNet was successfully employed to solve the ASS problem of estimating…

Sound · Computer Science 2018-05-30 Naoya Takahashi, Nabarun Goswami, Yuki Mitsufuji

Sampling-Frequency-Independent Universal Sound Separation

This paper proposes a universal sound separation (USS) method capable of handling untrained sampling frequencies (SFs). USS aims to separate arbitrary sources of different types and can be the key technique to realize a source…

Audio and Speech Processing · Electrical Eng. & Systems 2023-09-25 Tomohiko Nakamura, Kohei Yatabe

Audio Source Separation via Multi-Scale Learning with Dilated Dense U-Nets

Modern audio source separation techniques rely on optimizing sequence model architectures, such as 1D-CNNs, on mixture recordings to generalize well to unseen mixtures. Specifically, recent focus is on time-domain based architectures such…

Machine Learning · Computer Science 2019-04-09 Vivek Sivaraman Narayanaswamy, Sameeksha Katoch, Jayaraman J. Thiagarajan, Huan Song, Andreas Spanias

On Neural Architectures for Deep Learning-based Source Separation of Co-Channel OFDM Signals

We study the single-channel source separation problem involving orthogonal frequency-division multiplexing (OFDM) signals, which are ubiquitous in many modern-day digital communication systems. Related efforts have been pursued in monaural…

Signal Processing · Electrical Eng. & Systems 2023-06-28 Gary C. F. Lee, Amir Weiss, Alejandro Lancho, Yury Polyanskiy, Gregory W. Wornell

Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation

Models for audio source separation usually operate on the magnitude spectrum, which ignores phase information and makes separation performance dependent on hyper-parameters for the spectral front-end. Therefore, we investigate end-to-end…

Sound · Computer Science 2018-06-11 Daniel Stoller, Sebastian Ewert, Simon Dixon

Multi-Resolution Fully Convolutional Neural Networks for Monaural Audio Source Separation

In deep neural networks with convolutional layers, each layer typically has a fixed-size, single-resolution receptive field (RF). Convolutional layers with a large RF capture global information from the input features, while layers with small…

Sound · Computer Science 2017-11-01 Emad M. Grais, Hagen Wierstorf, Dominic Ward, Mark D. Plumbley

Multi-scale Multi-band DenseNets for Audio Source Separation

This paper deals with the problem of audio source separation. To handle the complex and ill-posed nature of the problems of audio source separation, the current state-of-the-art approaches employ deep neural networks to obtain instrumental…

Sound · Computer Science 2017-06-30 Naoya Takahashi, Yuki Mitsufuji

Source Separation and Depthwise Separable Convolutions for Computer Audition

Given recent advances in deep music source separation, we propose a feature representation method that combines source separation with a state-of-the-art representation learning technique that is suitably repurposed for computer audition…

Sound · Computer Science 2020-12-08 Gabriel Mersy, Jin Hong Kuan

Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation

Speaker-independent speech separation has achieved remarkable performance in recent years with the development of deep neural networks (DNNs). Various network architectures, from traditional convolutional neural network (CNN) and recurrent…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-17 Xue Yang, Changchun Bao

Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation

Monaural source separation is important for many real-world applications. It is challenging because, with only a single channel of information available, without any constraints, an infinite number of solutions are possible. In this paper,…

Sound · Computer Science 2015-10-02 Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, Paris Smaragdis

Generalization Challenges for Neural Architectures in Audio Source Separation

Recent work has shown that recurrent neural networks can be trained to separate individual speakers in a sound mixture with high fidelity. Here we explore convolutional neural network models as an alternative and show that they achieve…

Sound · Computer Science 2018-05-29 Shariq Mobin, Brian Cheung, Bruno Olshausen

Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network

Recent advances in the design of neural network architectures, in particular those specialized in modeling sequences, have provided significant improvements in speech separation performance. In this work, we propose to use a bio-inspired…

Sound · Computer Science 2021-12-07 Xiaolin Hu, Kai Li, Weiyi Zhang, Yi Luo, Jean-Marie Lemercier, Timo Gerkmann

Towards efficient models for real-time deep noise suppression

With recent research advancements, deep learning models are becoming attractive and powerful choices for speech enhancement in real-time applications. While state-of-the-art models can achieve outstanding results in terms of speech quality…

Audio and Speech Processing · Electrical Eng. & Systems 2021-05-20 Sebastian Braun, Hannes Gamper, Chandan K. A. Reddy, Ivan Tashev

Raw Multi-Channel Audio Source Separation using Multi-Resolution Convolutional Auto-Encoders

Supervised multi-channel audio source separation requires extracting useful spectral, temporal, and spatial features from the mixed signals. The success of many existing systems is therefore largely dependent on the choice of features used…

Sound · Computer Science 2018-03-05 Emad M. Grais, Dominic Ward, Mark D. Plumbley

REAL-M: Towards Speech Separation on Real Mixtures

In recent years, deep learning based source separation has achieved impressive results. Most studies, however, still evaluate separation models on synthetic datasets, while the performance of state-of-the-art techniques on in-the-wild…

Audio and Speech Processing · Electrical Eng. & Systems 2021-10-22 Cem Subakan, Mirco Ravanelli, Samuele Cornell, François Grondin

Deep Remix: Remixing Musical Mixtures Using a Convolutional Deep Neural Network

Audio source separation is a difficult machine learning problem and performance is measured by comparing extracted signals with the component source signals. However, if separation is motivated by the ultimate goal of re-mixing then…

Sound · Computer Science 2015-05-05 Andrew J. R. Simpson, Gerard Roma, Mark D. Plumbley

Learning to Separate Voices by Spatial Regions

We consider the problem of audio voice separation for binaural applications, such as earphones and hearing aids. While today's neural networks perform remarkably well (separating $4+$ sources with 2 microphones) they assume a known or fixed…

Sound · Computer Science 2022-07-18 Zhongweiyang Xu, Romit Roy Choudhury

On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments

This paper introduces a new method for multi-channel time domain speech separation in reverberant environments. A fully-convolutional neural network structure has been used to directly separate speech from multiple microphone recordings,…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-12 Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker

Towards Real-Time Single-Channel Speech Separation in Noisy and Reverberant Environments

Real-time single-channel speech separation aims to unmix an audio stream captured from a single microphone that contains multiple people talking at once, environmental noise, and reverberation into multiple de-reverberated and noise-free…

Audio and Speech Processing · Electrical Eng. & Systems 2023-04-18 Julian Neri, Sebastian Braun