Related papers: ArrayDPS: Unsupervised Blind Speech Separation wit…

200 papers

Multi-channel speech enhancement aims to recover clean speech from noisy multi-channel recordings. Most deep learning methods employ discriminative training, which can lead to non-linear distortions from regression-based objectives,…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-26 Zhongweiyang Xu, Ashutosh Pandey, Juan Azcarreta, Zhaoheng Ni, Sanjeel Parekh, Buye Xu

We propose Uni-ArrayDPS, a novel diffusion-based refinement framework for unified multi-channel speech enhancement and separation. Existing methods for multi-channel speech enhancement/separation are mostly discriminative and are highly…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-27 Zhongweiyang Xu, Ashutosh Pandey, Juan Azcarreta, Zhaoheng Ni, Sanjeel Parekh, Buye Xu, Romit Roy Choudhury

We consider the problem of multi-channel single-speaker blind dereverberation, where multi-channel mixtures are used to recover the clean anechoic speech. To solve this problem, we propose USD-DPS, Unsupervised Speech Dereverberation…

Sound · Computer Science 2025-12-02 Yulun Wu, Zhongweiyang Xu, Jianchong Chen, Zhong-Qiu Wang, Romit Roy Choudhury

Speech separation has been shown effective for multi-talker speech recognition. Under the ad hoc microphone array setup where the array consists of spatially distributed asynchronous microphones, additional challenges must be overcome as…

Sound · Computer Science 2021-03-04 Dongmei Wang, Takuya Yoshioka, Zhuo Chen, Xiaofei Wang, Tianyan Zhou, Zhong Meng

Blind speech separation (BSS) aims to recover multiple speech sources from multi-channel, multi-speaker mixtures under unknown array geometry and room impulse responses. In unsupervised setup where clean target speech is not available for…

Sound · Computer Science 2025-10-13 Shulin He, Zhong-Qiu Wang

Speech separation is a fundamental task in audio processing, typically addressed with fully supervised systems trained on paired mixtures. While effective, such systems typically rely on synthetic data pipelines, which may not reflect…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-30 Runwu Shi, Kai Li, Chang Li, Jiang Wang, Sihan Tan, Kazuhiro Nakadai

Speech separation with several speakers is a challenging task because of the non-stationarity of the speech and the strong signal similarity between interferent sources. Current state-of-the-art solutions can separate well the different…

Signal Processing · Electrical Eng. & Systems 2021-02-09 Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

Given a time series of multicomponent measurements x(t), the usual objective of nonlinear blind source separation (BSS) is to find a "source" time series s(t), comprised of statistically independent combinations of the measured components.…

Artificial Intelligence · Computer Science 2015-05-13 David N. Levin

We consider the problem of separating speech sources captured by multiple spatially separated devices, each of which has multiple microphones and samples its signals at a slightly different rate. Most asynchronous array processing methods…

Audio and Speech Processing · Electrical Eng. & Systems 2019-12-12 Ryan M. Corey, Andrew C. Singer

Blind source separation (BSS), i.e., the decoupling of unknown signals that have been mixed in an unknown way, has been a topic of great interest in the signal processing community for the last decade, covering a wide range of applications…

Machine Learning · Statistics 2016-03-11 Eleftherios Kofidis

We propose a separation guided speaker diarization (SGSD) approach by fully utilizing a complementarity of speech separation and speaker clustering. Since the conventional clustering-based speaker diarization (CSD) approach cannot well…

Audio and Speech Processing · Electrical Eng. & Systems 2021-07-07 Shu-Tong Niu, Jun Du, Lei Sun, Chin-Hui Lee

This paper addresses the challenge of audio-visual single-microphone speech separation and enhancement in the presence of real-world environmental noise. Our approach is based on generative inverse sampling, where we model clean speech and…

Audio and Speech Processing · Electrical Eng. & Systems 2026-02-03 Yochai Yemini, Yoav Ellinson, Rami Ben-Ari, Sharon Gannot, Ethan Fetaya

The goal of this work is to develop a meeting transcription system that can recognize speech even when utterances of different speakers are overlapped. While speech overlaps have been regarded as a major obstacle in accurately transcribing…

Audio and Speech Processing · Electrical Eng. & Systems 2018-10-10 Takuya Yoshioka, Hakan Erdogan, Zhuo Chen, Xiong Xiao, Fil Alleva

In reverberant conditions with multiple concurrent speakers, each microphone acquires a mixture signal of multiple speakers at a different location. In over-determined conditions where the microphones out-number speakers, we can narrow down…

Sound · Computer Science 2023-10-31 Zhong-Qiu Wang, Shinji Watanabe

This paper presents a complete strategy for the geometry estimation of large microphone arrays of arbitrary shape. Largeness is intended here in both number of microphones (hundreds) and size (few meters). Such arrays can be used for…

Data Analysis, Statistics and Probability · Physics 2016-03-28 Charles Vanwynsberghe, Pascal Challande, Jacques Marchal, Régis Marchiano, François Ollivier

When dealing with overlapped speech, the performance of automatic speech recognition (ASR) systems substantially degrades as they are designed for single-talker speech. To enhance ASR performance in conversational or meeting environments,…

Audio and Speech Processing · Electrical Eng. & Systems 2023-11-16 Hassan Taherian, DeLiang Wang

When we place microphones close to a sound source near other sources in audio recording, the obtained audio signal includes undesired sound from the other sources, which is often called cross-talk or bleeding sound. For many audio…

In this paper, we address the problem of single-microphone speech separation in the presence of ambient noise. We propose a generative unsupervised technique that directly models both clean speech and structured noise components, training…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-19 Yochai Yemini, Rami Ben-Ari, Sharon Gannot, Ethan Fetaya

A class of methods based on multichannel linear prediction (MCLP) can achieve effective blind dereverberation of a source, when the source is observed with a microphone array. We propose an inventive use of MCLP as a pre-processing step for…

Sound · Computer Science 2017-02-28 İlker Bayram, Savaşkan Bulek

The task of blind source separation (BSS) involves separating sources from a mixture without prior knowledge of the sources or the mixing system. Single-channel mixtures and non-linear mixtures are a particularly challenging problem in BSS.…

Signal Processing · Electrical Eng. & Systems 2025-07-24 Matthew B. Webster, Joonnyong Lee