Related papers: Compact Speaker Embedding: lrx-vector

Speaker Recognition using SincNet and X-Vector Fusion

In this paper, we propose an innovative approach to perform speaker recognition by fusing two recently introduced deep neural networks (DNNs) namely - SincNet and X-Vector. The idea behind using SincNet filters on the raw speech waveform is…

Computation and Language · Computer Science 2020-04-07 Mayank Tripathi, Divyanshu Singh, Seba Susan

Length- and Noise-aware Training Techniques for Short-utterance Speaker Recognition

Speaker recognition performance has been greatly improved with the emergence of deep learning. Deep neural networks show the capacity to effectively deal with impacts of noise and reverberation, making them attractive to far-field speaker…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-28 Wenda Chen, Jonathan Huang, Tobias Bocklet

Deep Speaker Embedding Learning with Multi-Level Pooling for Text-Independent Speaker Verification

This paper aims to improve the widely used deep speaker embedding x-vector model. We propose the following improvements: (1) a hybrid neural network structure using both time delay neural network (TDNN) and long short-term memory neural…

Computation and Language · Computer Science 2019-02-22 Yun Tang, Guohong Ding, Jing Huang, Xiaodong He, Bowen Zhou

Improving Embedding Extraction for Speaker Verification with Ladder Network

Speaker verification is an established yet challenging task in speech processing and a very vibrant research area. Recent speaker verification (SV) systems rely on deep neural networks to extract high-level embeddings which are able to…

Audio and Speech Processing · Electrical Eng. & Systems 2020-03-23 Fei Tao, Gokhan Tur

Multi-Task Learning with High-Order Statistics for X-vector based Text-Independent Speaker Verification

The x-vector based deep neural network (DNN) embedding systems have demonstrated effectiveness for text-independent speaker verification. This paper presents a multi-task learning architecture for training the speaker embedding DNN with the…

Audio and Speech Processing · Electrical Eng. & Systems 2019-04-05 Lanhua You, Wu Guo, Lirong Dai, Jun Du

Distilling Multi-Level X-vector Knowledge for Small-footprint Speaker Verification

Even though deep speaker models have demonstrated impressive accuracy in speaker verification tasks, this often comes at the expense of increased model size and computation time, presenting challenges for deployment in resource-constrained…

Sound · Computer Science 2023-12-21 Xuechen Liu, Md Sahidullah, Tomi Kinnunen

Speaker Diarization with LSTM

For many years, i-vector based audio embedding techniques were the dominant approach for speaker verification and speaker diarization applications. However, mirroring the rise of deep learning in various domains, neural network based audio…

Audio and Speech Processing · Electrical Eng. & Systems 2022-01-25 Quan Wang, Carlton Downey, Li Wan, Philip Andrew Mansfield, Ignacio Lopez Moreno

Attention Mechanism in Speaker Recognition: What Does It Learn in Deep Speaker Embedding?

This paper presents an experimental study on deep speaker embedding with an attention mechanism that has been found to be a powerful representation learning technique in speaker recognition. In this framework, an attention model works as a…

Sound · Computer Science 2018-09-26 Qiongqiong Wang, Koji Okabe, Kong Aik Lee, Hitoshi Yamamoto, Takafumi Koshinaka

Deep Speaker Embeddings for Far-Field Speaker Recognition on Short Utterances

Speaker recognition systems based on deep speaker embeddings have achieved significant performance in controlled conditions according to the results obtained for early NIST SRE (Speaker Recognition Evaluation) datasets. From the practical…

Sound · Computer Science 2020-02-17 Aleksei Gusev, Vladimir Volokhov, Tseren Andzhukaev, Sergey Novoselov, Galina Lavrentyeva, Marina Volkova, Alice Gazizullina, Andrey Shulipa, Artem Gorlanov, Anastasia Avdeeva, Artem Ivanov, Alexander Kozlov, Timur Pekhovsky, Yuri Matveev

SpeechNAS: Towards Better Trade-off between Latency and Accuracy for Large-Scale Speaker Verification

Recently, x-vector has been a successful and popular approach for speaker verification, which employs a time delay neural network (TDNN) and statistics pooling to extract speaker characterizing embedding from variable-length utterances.…

Sound · Computer Science 2022-01-02 Wentao Zhu, Tianlong Kong, Shun Lu, Jixiang Li, Dawei Zhang, Feng Deng, Xiaorui Wang, Sen Yang, Ji Liu

What Does the Speaker Embedding Encode?

Developing a good speaker embedding has received tremendous interest in the speech community, with representations such as i-vector and d-vector demonstrating remarkable performance across various tasks. Despite their widespread adoption, a…

Audio and Speech Processing · Electrical Eng. & Systems 2025-12-23 Shuai Wang, Yanmin Qian, Kai Yu

Generative x-vectors for text-independent speaker verification

Speaker verification (SV) systems using deep neural network embeddings, so-called the x-vector systems, are becoming popular due to its good performance superior to the i-vector systems. The fusion of these systems provides improved…

Audio and Speech Processing · Electrical Eng. & Systems 2018-09-19 Longting Xu, Rohan Kumar Das, Emre Yılmaz, Jichen Yang, Haizhou Li

Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification

Speaker verification systems usually suffer from the mismatch problem between training and evaluation data, such as speaker population mismatch, the channel and environment variations. In order to address this issue, it requires the system…

Audio and Speech Processing · Electrical Eng. & Systems 2020-04-09 Xu Li, Jinghua Zhong, Jianwei Yu, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng

ECAPA-TDNN Embeddings for Speaker Diarization

Learning robust speaker embeddings is a crucial step in speaker diarization. Deep neural networks can accurately capture speaker discriminative characteristics and popular deep embeddings such as x-vectors are nowadays a fundamental…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-14 Nauman Dawalatabad, Mirco Ravanelli, François Grondin, Jenthe Thienpondt, Brecht Desplanques, Hwidong Na

BUT VOiCES 2019 System Description

This is a description of our effort in VOiCES 2019 Speaker Recognition challenge. All systems in the fixed condition are based on the x-vector paradigm with different features and DNN topologies. The single best system reaches 1.2% EER and…

Audio and Speech Processing · Electrical Eng. & Systems 2019-07-16 Hossein Zeinali, Pavel Matějka, Ladislav Mošner, Oldřich Plchot, Anna Silnova, Ondřej Novotný, Ján Profant, Ondřej Glembek, Lukáš Burget

Speaker Embedding Extraction with Phonetic Information

Speaker embeddings achieve promising results on many speaker verification tasks. Phonetic information, as an important component of speech, is rarely considered in the extraction of speaker embeddings. In this paper, we introduce phonetic…

Sound · Computer Science 2018-06-15 Yi Liu, Liang He, Jia Liu, Michael T. Johnson

Data augmentation versus noise compensation for x-vector speaker recognition systems in noisy environments

The explosion of available speech data and new speaker modeling methods based on deep neural networks (DNN) have given the ability to develop more robust speaker recognition systems. Among DNN speaker modelling techniques, x-vector system…

Sound · Computer Science 2020-06-30 Mohammad Mohammadamini, Driss Matrouf

Cross-lingual Speaker Verification with Deep Feature Learning

Existing speaker verification (SV) systems often suffer from performance degradation if there is any language mismatch between model training, speaker enrollment, and test. A major cause of this degradation is that most existing SV methods…

Sound · Computer Science 2017-06-27 Lantian Li, Dong Wang, Askar Rozi, Thomas Fang Zheng

X-Vectors with Multi-Scale Aggregation for Speaker Diarization

Speaker diarization is the process of labeling different speakers in a speech signal. Deep speaker embeddings are generally extracted from short speech segments and clustered to determine the segments belong to same speaker identity. The…

Audio and Speech Processing · Electrical Eng. & Systems 2021-05-18 Myungjong Kim, Vijendra Raj Apsingekar, Divya Neelagiri

Probing the Information Encoded in X-vectors

Deep neural network based speaker embeddings, such as x-vectors, have been shown to perform well in text-independent speaker recognition/verification tasks. In this paper, we use simple classifiers to investigate the contents encoded by…

Audio and Speech Processing · Electrical Eng. & Systems 2020-06-16 Desh Raj, David Snyder, Daniel Povey, Sanjeev Khudanpur