Related papers: Constrained speaker linking

200 papers

Speaker attribution is required in many real-world applications, such as meeting transcription, where speaker identity is assigned to each utterance according to speaker voice profiles. In this paper, we propose to solve the speaker…

Audio and Speech Processing · Electrical Eng. & Systems 2021-02-09 Jixuan Wang, Xiong Xiao, Jian Wu, Ranjani Ramamurthy, Frank Rudzicz, Michael Brudno

Speech separation with several speakers is a challenging task because of the non-stationarity of the speech and the strong signal similarity between interferent sources. Current state-of-the-art solutions can separate well the different…

Signal Processing · Electrical Eng. & Systems 2021-02-09 Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

Identifying multiple speakers without knowing where a speaker's voice is in a recording is a challenging task. In this paper, a hierarchical attention network is proposed to solve a weakly labelled speaker identification problem. The use of…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-28 Yanpei Shi, Qiang Huang, Thomas Hain

Speaker recognition deals with recognizing speakers by their speech. Most speaker recognition systems are built upon two stages: the first stage extracts low dimensional correlation embeddings from speech, and the second performs the…

Usable speech criteria are proposed to extract minimally corrupted speech for speaker identification (SID) in co-channel speech. In co-channel speech, either speaker can randomly appear as the stronger speaker or the weaker one at a time.…

Sound · Computer Science 2013-01-03 Wajdi Ghezaiel, Amel Ben Slimane, Ezzedine Ben Braiek

Neural models, in particular the d-vector and x-vector architectures, have produced state-of-the-art performance on many speaker verification tasks. However, two potential problems of these neural models deserve more investigation. Firstly,…

Audio and Speech Processing · Electrical Eng. & Systems 2019-02-19 Lantian Li, Zhiyuan Tang, Ying Shi, Dong Wang

Speaker diarization has gained considerable attention within the speech processing research community. Mainstream speaker diarization systems rely primarily on speakers' voice characteristics extracted from acoustic signals and often overlook the…

Sound · Computer Science 2024-02-06 Luyao Cheng, Siqi Zheng, Qinglin Zhang, Hui Wang, Yafeng Chen, Qian Chen, Shiliang Zhang

Lately there have been novel developments in deep learning towards solving the cocktail party problem. Initial results are very promising and allow for more research in the domain. One technique that has not yet been explored in the neural…

Sound · Computer Science 2017-08-30 Jeroen Zegers, Hugo Van hamme

We present a unified network for voice separation of an unknown number of speakers. The proposed approach is composed of several separation heads optimized together with a speaker classification branch. The separation is carried out in the…

Sound · Computer Science 2020-11-05 Shlomo E. Chazan, Lior Wolf, Eliya Nachmani, Yossi Adi

Speaker embeddings are promising identity-related features that can enhance the identity assignment performance of a tracking system by leveraging its spatial predictions, i.e., by performing identity reassignment. Common speaker embedding…

Audio and Speech Processing · Electrical Eng. & Systems 2025-08-21 Taous Iatariene, Alexandre Guérin, Romain Serizel

We address the problem of acoustic source separation in a deep learning framework we call "deep clustering." Rather than directly estimating signals or masking functions, we train a deep network to produce spectrogram embeddings that are…

Neural and Evolutionary Computing · Computer Science 2015-08-19 John R. Hershey, Zhuo Chen, Jonathan Le Roux, Shinji Watanabe

In this paper, we address the problem of speaker recognition in challenging acoustic conditions using a novel method to extract robust speaker-discriminative speech representations. We adopt a recently proposed unsupervised adversarial…

Audio and Speech Processing · Electrical Eng. & Systems 2019-11-05 Raghuveer Peri, Monisankha Pal, Arindam Jati, Krishna Somandepalli, Shrikanth Narayanan

This paper presents a computationally efficient and distributed speaker diarization framework for networked IoT-style audio devices. The work proposes a Federated Learning model which can identify the participants in a conversation without…

Sound · Computer Science 2024-12-02 Amit Kumar Bhuyan, Hrishikesh Dutta, Subir Biswas

Speaker clustering is an essential step in conventional speaker diarization systems and is typically addressed as an audio-only speech processing task. The language used by the participants in a conversation, however, carries additional…

Audio and Speech Processing · Electrical Eng. & Systems 2022-07-12 Nikolaos Flemotomos, Shrikanth Narayanan

Clustering-based speaker diarization has stood firm as one of the major approaches in practice, despite recent developments in end-to-end diarization. However, clustering methods have not been explored extensively for speaker diarization.…

Sound · Computer Science 2022-04-27 Siqi Zheng, Hongbin Suo

Speaker identification systems in a real-world scenario are tasked with identifying a speaker among a set of enrolled speakers given just a few samples for each enrolled speaker. This paper demonstrates the effectiveness of meta-learning and…

Audio and Speech Processing · Electrical Eng. & Systems 2022-07-25 Ashutosh Chaubey, Sparsh Sinha, Susmita Ghose

Speaker identification in the household scenario (e.g., for smart speakers) is typically based on only a few enrollment utterances but a much larger set of unlabeled data, suggesting semi-supervised learning to improve speaker profiles. We…

Sound · Computer Science 2022-02-22 Long Chen, Venkatesh Ravichandran, Andreas Stolcke

Speaker identification typically involves three stages. First, a front-end speaker embedding model is trained to embed utterance and speaker profiles. Second, a scoring function is applied between a runtime utterance and each speaker…

Audio and Speech Processing · Electrical Eng. & Systems 2022-02-22 Zhenning Tan, Yuguang Yang, Eunjung Han, Andreas Stolcke

In multilingual societies, social conversations often involve code-mixed speech. Current speech technology may not be well equipped to extract information from multilingual, multi-speaker conversations. The DISPLACE challenge entails a…

Audio and Speech Processing · Electrical Eng. & Systems 2023-06-06 Shikha Baghel, Shreyas Ramoji, Sidharth, Ranjana H, Prachi Singh, Somil Jain, Pratik Roy Chowdhuri, Kaustubh Kulkarni, Swapnil Padhi, Deepu Vijayasenan, Sriram Ganapathy

Speaker clustering is the task of differentiating speakers in a recording. In a way, the aim is to answer "who spoke when" in audio recordings. A common method used in industry is feature extraction directly from the recording thanks to…

Sound · Computer Science 2018-03-23 Maxime Jumelle, Taqiyeddine Sakmeche