Related papers: Deep Normalization for Speaker Vectors

200 papers

Deep speaker embedding represents the state-of-the-art technique for speaker recognition. A key problem with this approach is that the resulting deep speaker vectors tend to be irregularly distributed. In previous research, we proposed a…

Sound · Computer Science 2020-11-02 Yunqi Cai , Lantian Li , Dong Wang , Andrew Abel

A deep learning approach has been proposed recently to derive speaker identities (d-vector) by a deep neural network (DNN). This approach has been applied to text-dependent speaker recognition tasks and shows reasonable performance gains…

Computation and Language · Computer Science 2015-06-30 Lantian Li , Yiye Lin , Zhiyong Zhang , Dong Wang

In this paper, we propose an iterative framework for self-supervised speaker representation learning based on a deep neural network (DNN). The framework starts with training a self-supervision speaker embedding network by maximizing…

Audio and Speech Processing · Electrical Eng. & Systems 2020-10-29 Danwei Cai , Weiqing Wang , Ming Li

Recently, speaker embeddings extracted from a speaker discriminative deep neural network (DNN) yield better performance than the conventional methods such as i-vector. In most cases, the DNN speaker classifier is trained using cross entropy…

Audio and Speech Processing · Electrical Eng. & Systems 2019-06-19 Xu Xiang , Shuai Wang , Houjun Huang , Yanmin Qian , Kai Yu

Learning robust speaker embeddings is a crucial step in speaker diarization. Deep neural networks can accurately capture speaker discriminative characteristics and popular deep embeddings such as x-vectors are nowadays a fundamental…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-14 Nauman Dawalatabad , Mirco Ravanelli , François Grondin , Jenthe Thienpondt , Brecht Desplanques , Hwidong Na

Neural models, in particular the d-vector and x-vector architectures, have produced state-of-the-art performance on many speaker verification tasks. However, two potential problems of these neural models deserve more investigation. Firstly,…

Audio and Speech Processing · Electrical Eng. & Systems 2019-02-19 Lantian Li , Zhiyuan Tang , Ying Shi , Dong Wang

This paper presents an improved deep embedding learning method based on convolutional neural network (CNN) for text-independent speaker verification. Two improvements are proposed for x-vector embedding learning: (1) Multi-scale convolution…

Audio and Speech Processing · Electrical Eng. & Systems 2020-01-15 Bin Gu , Wu Guo

This paper aims to improve the widely used deep speaker embedding x-vector model. We propose the following improvements: (1) a hybrid neural network structure using both time delay neural network (TDNN) and long short-term memory neural…

Computation and Language · Computer Science 2019-02-22 Yun Tang , Guohong Ding , Jing Huang , Xiaodong He , Bowen Zhou

We address far-field speaker verification with deep neural network (DNN) based speaker embedding extractor, where mismatch between enrollment and test data often comes from convolutive effects (e.g. room reverberation) and noise. To…

Sound · Computer Science 2021-09-27 Xuechen Liu , Md Sahidullah , Tomi Kinnunen

This paper proposes novel algorithms for speaker embedding using subjective inter-speaker similarity based on deep neural networks (DNNs). Although conventional DNN-based speaker embedding such as a $d$-vector can be applied to…

Audio and Speech Processing · Electrical Eng. & Systems 2019-07-22 Yuki Saito , Shinnosuke Takamichi , Hiroshi Saruwatari

Recently, speaker embeddings extracted with deep neural networks became the state-of-the-art method for speaker verification. In this paper we aim to facilitate its implementation on a more generic toolkit than Kaldi, which we anticipate to…

Sound · Computer Science 2018-11-07 Hossein Zeinali , Lukas Burget , Johan Rohdin , Themos Stafylakis , Jan Cernocky

Linear Discriminant Analysis (LDA) has been used as a standard post-processing procedure in many state-of-the-art speaker recognition tasks. Through maximizing the inter-speaker difference and minimizing the intra-speaker variation, LDA…

Sound · Computer Science 2018-05-04 Shuai Wang , Zili Huang , Yanmin Qian , Kai Yu

We present Deep Speaker, a neural speaker embedding system that maps utterances to a hypersphere where speaker similarity is measured by cosine similarity. The embeddings generated by Deep Speaker can be used for many tasks, including…

Computation and Language · Computer Science 2017-05-08 Chao Li , Xiaokong Ma , Bing Jiang , Xiangang Li , Xuewei Zhang , Xiao Liu , Ying Cao , Ajay Kannan , Zhenyao Zhu

In multi-speaker applications it is common to have pre-computed models from enrolled speakers. Using these models to identify the instances in which these speakers intervene in a recording is the task of speaker tracking. In this paper, we…

Modern automatic speaker verification relies largely on deep neural networks (DNNs) trained on mel-frequency cepstral coefficient (MFCC) features. While there are alternative feature extraction methods based on phase, prosody and long-term…

Audio and Speech Processing · Electrical Eng. & Systems 2020-07-31 Xuechen Liu , Md Sahidullah , Tomi Kinnunen

In this paper we propose a new method of speaker diarization that employs a deep learning architecture to learn speaker embeddings. In contrast to the traditional approaches that build their speaker embeddings using manually hand-crafted…

Sound · Computer Science 2017-09-18 Pawel Cyrta , Tomasz Trzciński , Wojciech Stokowiec

Over the recent years, various deep learning-based methods were proposed for extracting a fixed-dimensional embedding vector from speech signals. Although the deep learning-based embedding extraction methods have shown good performance in…

Audio and Speech Processing · Electrical Eng. & Systems 2021-12-08 Woo Hyun Kang , Jahangir Alam , Abderrahim Fathan

Over the recent years, various deep learning-based embedding methods have been proposed and have shown impressive performance in speaker verification. However, as in most of the classical embedding techniques, the deep learning-based…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-10 Woo Hyun Kang , Sung Hwan Mun , Min Hyun Han , Nam Soo Kim

Recent advancements in speaker verification techniques show promise, but their performance often deteriorates significantly in challenging acoustic environments. Although speech enhancement methods can improve perceived audio quality, they…

Audio and Speech Processing · Electrical Eng. & Systems 2025-08-27 Adam Katav , Yair Moshe , Israel Cohen

After their introduction to robust speech recognition, power normalized cepstral coefficient (PNCC) features were successfully adopted to other tasks, including speaker verification. However, as a feature extractor with long-term operations…

Sound · Computer Science 2021-09-27 Xuechen Liu , Md Sahidullah , Tomi Kinnunen