Related papers: ArrayDPS: Unsupervised Blind Speech Separation wit…

ArrayDPS-Refine: Generative Refinement of Discriminative Multi-Channel Speech Enhancement

Multi-channel speech enhancement aims to recover clean speech from noisy multi-channel recordings. Most deep learning methods employ discriminative training, which can lead to non-linear distortions from regression-based objectives,…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-26 Zhongweiyang Xu, Ashutosh Pandey, Juan Azcarreta, Zhaoheng Ni, Sanjeel Parekh, Buye Xu

Unified Diffusion Refinement for Multi-Channel Speech Enhancement and Separation

We propose Uni-ArrayDPS, a novel diffusion-based refinement framework for unified multi-channel speech enhancement and separation. Existing methods for multi-channel speech enhancement/separation are mostly discriminative and are highly…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-27 Zhongweiyang Xu, Ashutosh Pandey, Juan Azcarreta, Zhaoheng Ni, Sanjeel Parekh, Buye Xu, Romit Roy Choudhury

Unsupervised Multi-channel Speech Dereverberation via Diffusion

We consider the problem of multi-channel single-speaker blind dereverberation, where multi-channel mixtures are used to recover the clean anechoic speech. To solve this problem, we propose USD-DPS, Unsupervised Speech Dereverberation…

Sound · Computer Science 2025-12-02 Yulun Wu, Zhongweiyang Xu, Jianchong Chen, Zhong-Qiu Wang, Romit Roy Choudhury

Continuous Speech Separation with Ad Hoc Microphone Arrays

Speech separation has been shown effective for multi-talker speech recognition. Under the ad hoc microphone array setup where the array consists of spatially distributed asynchronous microphones, additional challenges must be overcome as…

Sound · Computer Science 2021-03-04 Dongmei Wang, Takuya Yoshioka, Zhuo Chen, Xiaofei Wang, Tianyan Zhou, Zhong Meng

VM-UNSSOR: Unsupervised Neural Speech Separation Enhanced by Higher-SNR Virtual Microphone Arrays

Blind speech separation (BSS) aims to recover multiple speech sources from multi-channel, multi-speaker mixtures under unknown array geometry and room impulse responses. In unsupervised setup where clean target speech is not available for…

Sound · Computer Science 2025-10-13 Shulin He, Zhong-Qiu Wang

Unsupervised Single-Channel Speech Separation with a Diffusion Prior under Speaker-Embedding Guidance

Speech separation is a fundamental task in audio processing, typically addressed with fully supervised systems trained on paired mixtures. While effective, such systems typically rely on synthetic data pipelines, which may not reflect…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-30 Runwu Shi, Kai Li, Chang Li, Jiang Wang, Sihan Tan, Kazuhiro Nakadai

Distributed speech separation in spatially unconstrained microphone arrays

Speech separation with several speakers is a challenging task because of the non-stationarity of the speech and the strong signal similarity between interferent sources. Current state-of-the-art solutions can separate well the different…

Signal Processing · Electrical Eng. & Systems 2021-02-09 Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

Performing Nonlinear Blind Source Separation with Signal Invariants

Given a time series of multicomponent measurements x(t), the usual objective of nonlinear blind source separation (BSS) is to find a "source" time series s(t), comprised of statistically independent combinations of the measured components.…

Artificial Intelligence · Computer Science 2015-05-13 David N. Levin

Speech Separation Using Partially Asynchronous Microphone Arrays Without Resampling

We consider the problem of separating speech sources captured by multiple spatially separated devices, each of which has multiple microphones and samples its signals at a slightly different rate. Most asynchronous array processing methods…

Audio and Speech Processing · Electrical Eng. & Systems 2019-12-12 Ryan M. Corey, Andrew C. Singer

Blind Source Separation: Fundamentals and Recent Advances (A Tutorial Overview Presented at SBrT-2001)

Blind source separation (BSS), i.e., the decoupling of unknown signals that have been mixed in an unknown way, has been a topic of great interest in the signal processing community for the last decade, covering a wide range of applications…

Machine Learning · Statistics 2016-03-11 Eleftherios Kofidis

Separation Guided Speaker Diarization in Realistic Mismatched Conditions

We propose a separation guided speaker diarization (SGSD) approach by fully utilizing a complementarity of speech separation and speaker clustering. Since the conventional clustering-based speaker diarization (CSD) approach cannot well…

Audio and Speech Processing · Electrical Eng. & Systems 2021-07-07 Shu-Tong Niu, Jun Du, Lei Sun, Chin-Hui Lee

SSNAPS: Audio-Visual Separation of Speech and Background Noise with Diffusion Inverse Sampling

This paper addresses the challenge of audio-visual single-microphone speech separation and enhancement in the presence of real-world environmental noise. Our approach is based on generative inverse sampling, where we model clean speech and…

Audio and Speech Processing · Electrical Eng. & Systems 2026-02-03 Yochai Yemini, Yoav Ellinson, Rami Ben-Ari, Sharon Gannot, Ethan Fetaya

Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks

The goal of this work is to develop a meeting transcription system that can recognize speech even when utterances of different speakers are overlapped. While speech overlaps have been regarded as a major obstacle in accurately transcribing…

Audio and Speech Processing · Electrical Eng. & Systems 2018-10-10 Takuya Yoshioka, Hakan Erdogan, Zhuo Chen, Xiong Xiao, Fil Alleva

UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-determined Training Mixtures

In reverberant conditions with multiple concurrent speakers, each microphone acquires a mixture signal of multiple speakers at a different location. In over-determined conditions where the microphones out-number speakers, we can narrow down…

Sound · Computer Science 2023-10-31 Zhong-Qiu Wang, Shinji Watanabe

A robust and passive method for geometric calibration of large arrays

This paper presents a complete strategy for the geometry estimation of large microphone arrays of arbitrary shape. Largeness is intended here in both number of microphones (hundreds) and size (few meters). Such arrays can be used for…

Data Analysis, Statistics and Probability · Physics 2016-03-28 Charles Vanwynsberghe, Pascal Challande, Jacques Marchal, Régis Marchiano, François Ollivier

Multi-channel Conversational Speaker Separation via Neural Diarization

When dealing with overlapped speech, the performance of automatic speech recognition (ASR) systems substantially degrades as they are designed for single-talker speech. To enhance ASR performance in conversational or meeting environments,…

Audio and Speech Processing · Electrical Eng. & Systems 2023-11-16 Hassan Taherian, DeLiang Wang

Prior Distribution Design for Music Bleeding-Sound Reduction Based on Nonnegative Matrix Factorization

When we place microphones close to a sound source near other sources in audio recording, the obtained audio signal includes undesired sound from the other sources, which is often called cross-talk or bleeding sound. For many audio…

Sound · Computer Science 2021-09-02 Yusaku Mizobuchi, Daichi Kitamura, Tomohiko Nakamura, Hiroshi Saruwatari, Yu Takahashi, Kazunobu Kondo

Diffusion-Based Unsupervised Audio-Visual Speech Separation in Noisy Environments with Noise Prior

In this paper, we address the problem of single-microphone speech separation in the presence of ambient noise. We propose a generative unsupervised technique that directly models both clean speech and structured noise components, training…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-19 Yochai Yemini, Rami Ben-Ari, Sharon Gannot, Ethan Fetaya

Multichannel Linear Prediction for Blind Reverberant Audio Source Separation

A class of methods based on multichannel linear prediction (MCLP) can achieve effective blind dereverberation of a source, when the source is observed with a microphone array. We propose an inventive use of MCLP as a pre-processing step for…

Sound · Computer Science 2017-02-28 İlker Bayram, Savaşkan Bulek

Blind Source Separation of Single-Channel Mixtures via Multi-Encoder Autoencoders

The task of blind source separation (BSS) involves separating sources from a mixture without prior knowledge of the sources or the mixing system. Single-channel mixtures and non-linear mixtures are a particularly challenging problem in BSS.…

Signal Processing · Electrical Eng. & Systems 2025-07-24 Matthew B. Webster, Joonnyong Lee