Related papers: Guided Speaker Embedding

200 papers

Obtaining high-quality speaker embeddings in multi-speaker conditions is crucial for many applications. A recently proposed guided speaker embedding framework, which utilizes speech activities of target and non-target speakers as clues,…

Audio and Speech Processing · Electrical Eng. & Systems · 2025-06-17 · Shota Horiguchi, Takanori Ashihara, Marc Delcroix, Atsushi Ando, Naohiro Tawara

This paper proposes a method for extracting speaker embedding for each speaker from a variable-length recording containing multiple speakers. Speaker embeddings are crucial not only for speaker recognition but also for various multi-speaker…

Audio and Speech Processing · Electrical Eng. & Systems · 2024-09-02 · Shota Horiguchi, Atsushi Ando, Takafumi Moriya, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, Marc Delcroix

The performance of speaker verification degrades significantly when the test speech is corrupted by interfering speakers. Speaker diarization separates speakers well if they do not overlap in time. However, if…

Audio and Speech Processing · Electrical Eng. & Systems · 2019-02-08 · Wei Rao, Chenglin Xu, Eng Siong Chng, Haizhou Li

The speaker extraction technique seeks to single out the voice of a target speaker from the interfering voices in a speech mixture. Typically an auxiliary reference of the target speaker is used to form voluntary attention. Either a…

Audio and Speech Processing · Electrical Eng. & Systems · 2023-03-10 · Zexu Pan, Wupeng Wang, Marvin Borsdorf, Haizhou Li

End-to-end neural speaker diarization systems are able to address the speaker diarization task while effectively handling speech overlap. This work explores the incorporation of speaker information embeddings into the end-to-end systems to…

Sound · Computer Science · 2024-07-02 · Juan Ignacio Alvarez-Trejos, Beltrán Labrador, Alicia Lozano-Diez

The objective of this work is effective speaker diarisation using multi-scale speaker embeddings. Typically, there is a trade-off between the ability to recognise short speaker segments and the discriminative power of the embedding,…

Audio and Speech Processing · Electrical Eng. & Systems · 2021-10-11 · Youngki Kwon, Hee-Soo Heo, Jee-weon Jung, You Jin Kim, Bong-Jin Lee, Joon Son Chung

Traditional speech separation and speaker diarization approaches rely on prior knowledge of target speakers or a predetermined number of participants in audio signals. To address these limitations, recent advances focus on developing…

We propose a new method for speaker diarization that can handle overlapping speech with 2+ people. Our method is based on compositional embeddings [1]: Like standard speaker embedding methods such as x-vector [2], compositional embedding…

Sound · Computer Science · 2021-02-11 · Zeqian Li, Jacob Whitehill

Informed speaker extraction aims to extract a target speech signal from a mixture of sources given prior knowledge about the desired speaker. Recent deep learning-based methods leverage a speaker discriminative model that maps a reference…

Audio and Speech Processing · Electrical Eng. & Systems · 2022-02-17 · Mohamed Elminshawi, Wolfgang Mack, Emanuël A. P. Habets

In multi-speaker applications, it is common to have pre-computed models from enrolled speakers. Using these models to identify the instances in which these speakers intervene in a recording is the task of speaker tracking. In this paper, we…

The goal of this paper is to adapt speaker embeddings for solving the problem of speaker diarisation. The quality of speaker embeddings is paramount to the performance of speaker diarisation systems. Despite this, prior works in the field…

Audio and Speech Processing · Electrical Eng. & Systems · 2021-04-08 · Youngki Kwon, Jee-weon Jung, Hee-Soo Heo, You Jin Kim, Bong-Jin Lee, Joon Son Chung

We introduce a monaural neural speaker embeddings extractor that computes an embedding for each speaker present in a speech mixture. To allow for supervised training, a teacher-student approach is employed: the teacher computes the target…

Audio and Speech Processing · Electrical Eng. & Systems · 2023-09-20 · Tobias Cord-Landwehr, Christoph Boeddeker, Cătălin Zorilă, Rama Doddipatla, Reinhold Haeb-Umbach

This paper proposes an online target speaker voice activity detection system for speaker diarization tasks, which does not require a priori knowledge from the clustering-based diarization system to obtain the target speaker embeddings. By…

Audio and Speech Processing · Electrical Eng. & Systems · 2023-10-16 · Weiqing Wang, Ming Li

Speaker extraction requires a speech sample from the target speaker as the reference. However, enrolling a speaker with a long speech sample is not practical. We propose a speaker extraction technique that operates in multiple stages to take full…

Audio and Speech Processing · Electrical Eng. & Systems · 2021-04-05 · Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li

Speaker embedding extractors (EEs), which map input audio to a speaker discriminant latent space, are of paramount importance in speaker diarisation. However, there are several challenges when adopting EEs for diarisation, from which we…

Current speaker diarization systems rely on an external voice activity detection model prior to speaker embedding extraction on the detected speech segments. In this paper, we establish that the attention system of a speaker embedding…

Audio and Speech Processing · Electrical Eng. & Systems · 2024-05-16 · Jenthe Thienpondt, Kris Demuynck

Despite the recent success of deep learning for many speech processing tasks, single-microphone, speaker-independent speech separation remains challenging for two main reasons. The first reason is the arbitrary order of the target and…

Sound · Computer Science · 2018-04-19 · Yi Luo, Zhuo Chen, Nima Mesgarani

Recently, end-to-end speaker extraction has attracted increasing attention and shown promising results. However, its performance is often inferior to that of a blind source separation (BSS) counterpart with a similar network architecture,…

Audio and Speech Processing · Electrical Eng. & Systems · 2022-04-05 · Zifeng Zhao, Dongchao Yang, Rongzhi Gu, Haoran Zhang, Yuexian Zou

Target speaker extraction aims to extract the speech of a specific speaker from a multi-talker mixture as specified by an auxiliary reference. Most studies focus on the scenario where the target speech is highly overlapped with the…

Sound · Computer Science · 2023-09-18 · Junjie Li, Ruijie Tao, Zexu Pan, Meng Ge, Shuai Wang, Haizhou Li

The objective of this work is to train noise-robust speaker embeddings adapted for speaker diarisation. Speaker embeddings play a crucial role in the performance of diarisation systems, but they often capture spurious information such as…

Sound · Computer Science · 2022-11-04 · You Jin Kim, Hee-Soo Heo, Jee-weon Jung, Youngki Kwon, Bong-Jin Lee, Joon Son Chung