Related papers: Binaural Angular Separation Network

Inference-Adaptive Neural Steering for Real-Time Area-Based Sound Source Separation

We propose a novel Neural Steering technique that adapts the target area of a spatial-aware multi-microphone sound source separation algorithm during inference without the necessity of retraining the deep neural network (DNN). To achieve…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-23 Martin Strauss, Wolfgang Mack, María Luis Valero, Okan Köpüklü

Efficient Area-based and Speaker-Agnostic Source Separation

This paper introduces an area-based source separation method designed for virtual meeting scenarios. The aim is to preserve speech signals from an unspecified number of sources within a defined spatial area in front of a linear microphone…

Audio and Speech Processing · Electrical Eng. & Systems 2024-08-20 Martin Strauss, Okan Köpüklü

Learning to Separate Voices by Spatial Regions

We consider the problem of audio voice separation for binaural applications, such as earphones and hearing aids. While today's neural networks perform remarkably well (separating $4+$ sources with 2 microphones) they assume a known or fixed…

Sound · Computer Science 2022-07-18 Zhongweiyang Xu, Romit Roy Choudhury

Towards Real-Time Single-Channel Speech Separation in Noisy and Reverberant Environments

Real-time single-channel speech separation aims to unmix an audio stream captured from a single microphone that contains multiple people talking at once, environmental noise, and reverberation into multiple de-reverberated and noise-free…

Audio and Speech Processing · Electrical Eng. & Systems 2023-04-18 Julian Neri, Sebastian Braun

Neural Speech Separation Using Spatially Distributed Microphones

This paper proposes a neural network based speech separation method using spatially distributed microphones. Unlike with traditional microphone array settings, neither the number of microphones nor their spatial arrangement is known in…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-01 Dongmei Wang, Zhuo Chen, Takuya Yoshioka

Two-stage model and optimal SI-SNR for monaural multi-speaker speech separation in noisy environment

In daily listening environments, speech is always distorted by background noise, room reverberation and interference speakers. With the developing of deep learning approaches, much progress has been performed on monaural multi-speaker…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-04 Chao Ma, Dongmei Li, Xupeng Jia

Exploring Efficient Directional and Distance Cues for Regional Speech Separation

In this paper, we introduce a neural network-based method for regional speech separation using a microphone array. This approach leverages novel spatial cues to extract the sound source not only from specified direction but also within…

Sound · Computer Science 2025-08-12 Yiheng Jiang, Haoxu Wang, Yafeng Chen, Gang Qiao, Biao Tian

Distributed speech separation in spatially unconstrained microphone arrays

Speech separation with several speakers is a challenging task because of the non-stationarity of the speech and the strong signal similarity between interferent sources. Current state-of-the-art solutions can separate well the different…

Signal Processing · Electrical Eng. & Systems 2021-02-09 Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

Short-time deep-learning based source separation for speech enhancement in reverberant environments with beamforming

The source separation-based speech enhancement problem with multiple beamforming in reverberant indoor environments is addressed in this paper. We propose that more generic solutions should cope with time-varying dynamic scenarios with…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-05 Alejandro Díaz, Diego Pincheira, Rodrigo Mahu, Nestor Becerra Yoma

Real-time binaural speech separation with preserved spatial cues

Deep learning speech separation algorithms have achieved great success in improving the quality and intelligibility of separated speech from mixed audio. Most previous methods focused on generating a single-channel output for each of the…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-18 Cong Han, Yi Luo, Nima Mesgarani

The Cone of Silence: Speech Separation by Localization

Given a multi-microphone recording of an unknown number of speakers talking concurrently, we simultaneously localize the sources and separate the individual speakers. At the core of our method is a deep network, in the waveform domain,…

Sound · Computer Science 2020-10-14 Teerapat Jenrungrot, Vivek Jayaram, Steve Seitz, Ira Kemelmacher-Shlizerman

Speaker-independent Speech Separation with Deep Attractor Network

Despite the recent success of deep learning for many speech processing tasks, single-microphone, speaker-independent speech separation remains challenging for two main reasons. The first reason is the arbitrary order of the target and…

Sound · Computer Science 2018-04-19 Yi Luo, Zhuo Chen, Nima Mesgarani

Deep attractor network for single-microphone speaker separation

Despite the overwhelming success of deep learning in various speech processing tasks, the problem of separating simultaneous speakers in a mixture remains challenging. Two major difficulties in such systems are the arbitrary source…

Sound · Computer Science 2017-11-30 Zhuo Chen, Yi Luo, Nima Mesgarani

Temporal-Spatial Neural Filter: Direction Informed End-to-End Multi-channel Target Speech Separation

Target speech separation refers to extracting the target speaker's speech from mixed signals. Despite the recent advances in deep learning based close-talk speech separation, the applications to real-world are still an open issue. Two main…

Sound · Computer Science 2020-01-03 Rongzhi Gu, Yuexian Zou

Dual-path Transformer Based Neural Beamformer for Target Speech Extraction

Neural beamformers, which integrate both pre-separation and beamforming modules, have demonstrated impressive effectiveness in target speech extraction. Nevertheless, the performance of these beamformers is inherently limited by the…

Sound · Computer Science 2023-09-08 Aoqi Guo, Sichong Qian, Baoxiang Li, Dazhi Gao

Real-time Speech Enhancement and Separation with a Unified Deep Neural Network for Single/Dual Talker Scenarios

This paper introduces a practical approach for leveraging a real-time deep learning model to alternate between speech enhancement and joint speech enhancement and separation depending on whether the input mixture contains one or two active…

Audio and Speech Processing · Electrical Eng. & Systems 2023-10-17 Kashyap Patel, Anton Kovalyov, Issa Panahi

Multi-Microphone Speaker Separation by Spatial Regions

We consider the task of region-based source separation of reverberant multi-microphone recordings. We assume pre-defined spatial regions with a single active source per region. The objective is to estimate the signals from the individual…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-14 Julian Wechsler, Srikanth Raj Chetupalli, Wolfgang Mack, Emanuël A. P. Habets

Robust Target Speaker Diarization and Separation via Augmented Speaker Embedding Sampling

Traditional speech separation and speaker diarization approaches rely on prior knowledge of target speakers or a predetermined number of participants in audio signals. To address these limitations, recent advances focus on developing…

Sound · Computer Science 2025-08-11 Md Asif Jalal, Luca Remaggi, Vasileios Moschopoulos, Thanasis Kotsiopoulos, Vandana Rajan, Karthikeyan Saravanan, Anastasis Drosou, Junho Heo, Hyuk Oh, Seokyeong Jeong

Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation

We propose multi-microphone complex spectral mapping, a simple way of applying deep learning for time-varying non-linear beamforming, for speaker separation in reverberant conditions. We aim at both speaker separation and dereverberation.…

Sound · Computer Science 2021-05-25 Zhong-Qiu Wang, Peidong Wang, DeLiang Wang

Speaker Diarization: Using Recurrent Neural Networks

Speaker Diarization is the problem of separating speakers in an audio. There could be any number of speakers and final result should state when speaker starts and ends. In this project, we analyze given audio file with 2 channels and 2…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-11 Vishal Sharma, Zekun Zhang, Zachary Neubert, Curtis Dyreson