Related papers: Localization Based Sequential Grouping for Continu…

200 papers

In this paper we describe a speaker diarization system that enables localization and identification of all speakers present in a conversation or meeting. We propose a novel systematic approach to tackle several long-standing challenges in…

Sound · Computer Science 2021-07-21 Siqi Zheng, Weilong Huang, Xianliang Wang, Hongbin Suo, Jinwei Feng, Zhijie Yan

Speaker diarization relies on the assumption that speech segments corresponding to a particular speaker are concentrated in a specific region of the speaker space; a region which represents that speaker's identity. These identities are not…

Audio and Speech Processing · Electrical Eng. & Systems 2021-02-09 Nikolaos Flemotomos, Panayiotis Georgiou, Shrikanth Narayanan

While there has been a substantial amount of work in speaker diarization recently, there are few efforts in jointly employing lexical and acoustic information for speaker segmentation. Towards that, we investigate a speaker diarization system…

Audio and Speech Processing · Electrical Eng. & Systems 2018-05-29 Tae Jin Park, Panayiotis Georgiou

Speaker diarization is the task of answering "Who spoke and when?" in an audio stream. Pipeline systems rely on speech segmentation to extract speakers' segments and achieve robust speaker diarization. This paper proposes a common framework…

Sound · Computer Science 2023-06-08 Théo Mariotte, Anthony Larcher, Silvio Montrésor, Jean-Hugh Thomas

Speaker diarization is the problem of separating speakers in an audio recording. There could be any number of speakers, and the final result should state when each speaker starts and ends. In this project, we analyze a given audio file with 2 channels and 2…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-11 Vishal Sharma, Zekun Zhang, Zachary Neubert, Curtis Dyreson

Traditional speech separation and speaker diarization approaches rely on prior knowledge of target speakers or a predetermined number of participants in audio signals. To address these limitations, recent advances focus on developing…

Given a multi-microphone recording of an unknown number of speakers talking concurrently, we simultaneously localize the sources and separate the individual speakers. At the core of our method is a deep network, in the waveform domain,…

Sound · Computer Science 2020-10-14 Teerapat Jenrungrot, Vivek Jayaram, Steve Seitz, Ira Kemelmacher-Shlizerman

Speaker diarization is a task to label audio or video recordings with classes that correspond to speaker identity, or in short, a task to identify "who spoke when". In the early years, speaker diarization algorithms were developed for…

Audio and Speech Processing · Electrical Eng. & Systems 2021-11-29 Tae Jin Park, Naoyuki Kanda, Dimitrios Dimitriadis, Kyu J. Han, Shinji Watanabe, Shrikanth Narayanan

Speaker diarization (SD) is a classic task in speech processing and is crucial in multi-party scenarios such as meetings and conversations. Current mainstream speaker diarization approaches consider acoustic information only, which results in…

Computation and Language · Computer Science 2023-05-23 Luyao Cheng, Siqi Zheng, Zhang Qinglin, Hui Wang, Yafeng Chen, Qian Chen

This paper proposes a novel Sequence-to-Sequence Neural Diarization (S2SND) framework to perform online and offline speaker diarization. It is developed from the sequence-to-sequence architecture of our previous target-speaker voice…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-24 Ming Cheng, Yuke Lin, Ming Li

We present an end-to-end deep network model that performs meeting diarization from single-channel audio recordings. End-to-end diarization models have the advantage of handling speaker overlap and enabling straightforward handling of…

Sound · Computer Science 2021-05-06 Soumi Maiti, Hakan Erdogan, Kevin Wilson, Scott Wisdom, Shinji Watanabe, John R. Hershey

Multi-source localization is an important and challenging technique for multi-talker conversation analysis. This paper proposes a novel supervised learning method using deep neural networks to estimate the direction of arrival (DOA) of all…

Audio and Speech Processing · Electrical Eng. & Systems 2021-11-30 Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu

Speaker diarization, the process of segmenting an audio stream or transcribed speech content into homogeneous partitions based on speaker identity, plays a crucial role in the interpretation and analysis of human speech. Most existing…

Machine Learning · Computer Science 2024-08-23 Luyao Cheng, Hui Wang, Siqi Zheng, Yafeng Chen, Rongjie Huang, Qinglin Zhang, Qian Chen, Xihao Li

Overlapped speech is notoriously problematic for speaker diarization systems. Consequently, the use of speech separation has recently been proposed to improve their performance. Although promising, speech separation models struggle with…

Audio and Speech Processing · Electrical Eng. & Systems 2024-02-02 Elio Gruttadauria, Mathieu Fontaine, Slim Essid

We propose a modular pipeline for the single-channel separation, recognition, and diarization of meeting-style recordings and evaluate it on the Libri-CSS dataset. Using a Continuous Speech Separation (CSS) system with a TF-GridNet…

Audio and Speech Processing · Electrical Eng. & Systems 2024-05-07 Thilo von Neumann, Christoph Boeddeker, Tobias Cord-Landwehr, Marc Delcroix, Reinhold Haeb-Umbach

Automatic speaker diarization techniques typically involve a two-stage processing approach where audio segments of fixed duration are converted to vector representations in the first stage. This is followed by an unsupervised clustering of…

Audio and Speech Processing · Electrical Eng. & Systems 2021-06-15 Prachi Singh, Sriram Ganapathy

This work presents a novel approach for speaker diarization to leverage lexical information provided by automatic speech recognition. We propose a speaker diarization system that can incorporate word-level speaker turn probabilities with…

Audio and Speech Processing · Electrical Eng. & Systems 2020-04-16 Tae Jin Park, Kyu J. Han, Jing Huang, Xiaodong He, Bowen Zhou, Panayiotis Georgiou, Shrikanth Narayanan

When dealing with overlapped speech, the performance of automatic speech recognition (ASR) systems substantially degrades as they are designed for single-talker speech. To enhance ASR performance in conversational or meeting environments,…

Audio and Speech Processing · Electrical Eng. & Systems 2023-11-16 Hassan Taherian, DeLiang Wang

The prevailing noise-resistant and reverberation-resistant localization algorithms primarily emphasize separating and providing directional output for each speaker in multi-speaker scenarios, without association with the identity of…

Sound · Computer Science 2023-10-18 Yu Chen, Xinyuan Qian, Zexu Pan, Kainan Chen, Haizhou Li

We address talker-independent monaural speaker separation from the perspectives of deep learning and computational auditory scene analysis (CASA). Specifically, we decompose the multi-speaker separation task into the stages of simultaneous…

Sound · Computer Science 2019-04-26 Yuzhou Liu, DeLiang Wang