Related papers: Block-Online Guided Source Separation

GPU-accelerated Guided Source Separation for Meeting Transcription

Guided source separation (GSS) is a type of target-speaker extraction method that relies on pre-computed speaker activities and blind source separation to perform front-end enhancement of overlapped speech signals. It was first proposed…

Audio and Speech Processing · Electrical Eng. & Systems 2023-08-15 Desh Raj, Daniel Povey, Sanjeev Khudanpur

Neural Blind Source Separation and Diarization for Distant Speech Recognition

This paper presents a neural method for distant speech recognition (DSR) that jointly separates and diarizes speech mixtures without supervision by isolated signals. A standard separation method for multi-talker DSR is a statistical…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-13 Yoshiaki Bando, Tomohiko Nakamura, Shinji Watanabe

All-neural online source separation, counting, and diarization for meeting analysis

Automatic meeting analysis comprises the tasks of speaker counting, speaker diarization, and the separation of overlapped speech, followed by automatic speech recognition. This all has to be carried out on arbitrarily long sessions and,…

Audio and Speech Processing · Electrical Eng. & Systems 2019-02-22 Thilo von Neumann, Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani, Reinhold Haeb-Umbach

Conversational Speech Separation: an Evaluation Study for Streaming Applications

Continuous speech separation (CSS) is a recently proposed framework which aims at separating each speaker from an input mixture signal in a streaming fashion. Hereafter we perform an evaluation study on practical design considerations for a…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-01 Giovanni Morrone, Samuele Cornell, Enrico Zovato, Alessio Brutti, Stefano Squartini

A general framework for online audio source separation

We consider the problem of online audio source separation. Existing algorithms adopt either a sliding block approach or a stochastic gradient approach, which is faster but less accurate. Also, they rely either on spatial cues or on spectral…

Sound · Computer Science 2011-12-30 Laurent S. R. Simon, Emmanuel Vincent

Constrained Online Recursive Source Separation Framework for Real-time Electrophysiological Signal Processing

Background and Objective: Processing electrophysiological signals often requires blind source separation (BSS) due to the nature of mixing source signals. However, its complex computational demands make real-time BSS challenging. The…

Human-Computer Interaction · Computer Science 2024-11-28 Yao Li, Haowen Zhao, Yunfei Liu, Xu Zhang

Separation Guided Speaker Diarization in Realistic Mismatched Conditions

We propose a separation guided speaker diarization (SGSD) approach by fully utilizing a complementarity of speech separation and speaker clustering. Since the conventional clustering-based speaker diarization (CSD) approach cannot well…

Audio and Speech Processing · Electrical Eng. & Systems 2021-07-07 Shu-Tong Niu, Jun Du, Lei Sun, Chin-Hui Lee

TISDiSS: A Training-Time and Inference-Time Scalable Framework for Discriminative Source Separation

Source separation is a fundamental task in speech, music, and audio processing, and it also provides cleaner and larger data for training generative models. However, improving separation performance in practice often depends on increasingly…

Sound · Computer Science 2025-10-15 Yongsheng Feng, Yuetonghui Xu, Jiehui Luo, Hongjia Liu, Xiaobing Li, Feng Yu, Wei Li

Efficient Area-based and Speaker-Agnostic Source Separation

This paper introduces an area-based source separation method designed for virtual meeting scenarios. The aim is to preserve speech signals from an unspecified number of sources within a defined spatial area in front of a linear microphone…

Audio and Speech Processing · Electrical Eng. & Systems 2024-08-20 Martin Strauss, Okan Köpüklü

Bitwise Source Separation on Hashed Spectra: An Efficient Posterior Estimation Scheme Using Partial Rank Order Metrics

This paper proposes an efficient bitwise solution to the single-channel source separation task. Most dictionary-based source separation algorithms rely on iterative update rules during the run time, which becomes computationally costly…

Sound · Computer Science 2017-12-04 Lijiang Guo, Minje Kim

Distributed speech separation in spatially unconstrained microphone arrays

Speech separation with several speakers is a challenging task because of the non-stationarity of the speech and the strong signal similarity between interferent sources. Current state-of-the-art solutions can separate well the different…

Signal Processing · Electrical Eng. & Systems 2021-02-09 Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

Single-Channel Distance-Based Source Separation for Mobile GPU in Outdoor and Indoor Environments

This study emphasizes the significance of exploring distance-based source separation (DSS) in outdoor environments. Unlike existing studies that primarily focus on indoor settings, the proposed model is designed to capture the unique…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-07 Hanbin Bae, Byungjun Kang, Jiwon Kim, Jaeyong Hwang, Hosang Sung, Hoon-Young Cho

Low-Latency Speech Separation Guided Diarization for Telephone Conversations

In this paper, we carry out an analysis on the use of speech separation guided diarization (SSGD) in telephone conversations. SSGD performs diarization by separating the speakers' signals and then applying voice activity detection on each…

Audio and Speech Processing · Electrical Eng. & Systems 2022-10-28 Giovanni Morrone, Samuele Cornell, Desh Raj, Luca Serafini, Enrico Zovato, Alessio Brutti, Stefano Squartini

Online speaker diarization of meetings guided by speech separation

Overlapped speech is notoriously problematic for speaker diarization systems. Consequently, the use of speech separation has recently been proposed to improve their performance. Although promising, speech separation models struggle with…

Audio and Speech Processing · Electrical Eng. & Systems 2024-02-02 Elio Gruttadauria, Mathieu Fontaine, Slim Essid

End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations

Recent works show that speech separation guided diarization (SSGD) is an increasingly promising direction, mainly thanks to the recent progress in speech separation. It performs diarization by first separating the speakers and then applying…

Audio and Speech Processing · Electrical Eng. & Systems 2024-05-24 Giovanni Morrone, Samuele Cornell, Luca Serafini, Enrico Zovato, Alessio Brutti, Stefano Squartini

SLOGD: Speaker LOcation Guided Deflation approach to speech separation

Speech separation is the process of separating multiple speakers from an audio recording. In this work we propose to separate the sources using a Speaker LOcalization Guided Deflation (SLOGD) approach wherein we estimate the sources…

Audio and Speech Processing · Electrical Eng. & Systems 2019-10-25 Sunit Sivasankaran, Emmanuel Vincent, Dominique Fohr

Block-Online Multi-Channel Speech Enhancement Using DNN-Supported Relative Transfer Function Estimates

This work addresses the problem of block-online processing for multi-channel speech enhancement. Such processing is vital in scenarios with moving speakers and/or when very short utterances are processed, e.g., in voice assistant scenarios.…

Sound · Computer Science 2020-05-27 Jiri Malek, Zbynek Koldovsky, Marek Bohac

Simultaneous Diarization and Separation of Meetings through the Integration of Statistical Mixture Models

We propose an approach for simultaneous diarization and separation of meeting data. It consists of a complex Angular Central Gaussian Mixture Model (cACGMM) for speech source separation, and a von-Mises-Fisher Mixture Model (VMFMM) for…

Audio and Speech Processing · Electrical Eng. & Systems 2025-02-25 Tobias Cord-Landwehr, Christoph Boeddeker, Reinhold Haeb-Umbach

Informed Group-Sparse Representation for Singing Voice Separation

Singing voice separation attempts to separate the vocal and instrumental parts of a music recording, which is a fundamental problem in music information retrieval. Recent work on singing voice separation has shown that the low-rank…

Audio and Speech Processing · Electrical Eng. & Systems 2018-01-12 Tak-Shing T. Chan, Yi-Hsuan Yang

Online Spectrogram Inversion for Low-Latency Audio Source Separation

Audio source separation is usually achieved by estimating the short-time Fourier transform (STFT) magnitude of each source, and then applying a spectrogram inversion algorithm to retrieve time-domain signals. In particular, the multiple…

Sound · Computer Science 2020-04-22 Paul Magron, Tuomas Virtanen