Related papers: Segment Aggregation for short utterances speaker v…

200 papers

The short duration of an input utterance is one of the most critical threats that degrade the performance of speaker verification systems. This study aimed to develop an integrated text-independent speaker verification system that inputs…

Audio and Speech Processing · Electrical Eng. & Systems 2019-04-11 Jee-weon Jung, Hee-soo Heo, Hye-jin Shim, Ha-jin Yu

The performances of the automatic speaker verification (ASV) systems degrade due to the reduction in the amount of speech used for enrollment and verification. Combining multiple systems based on different features and classifiers…

Computer Vision and Pattern Recognition · Computer Science 2019-02-01 Arnab Poddar, Md Sahidullah, Goutam Saha

LSTM-based speaker verification usually uses a fixed-length local segment randomly truncated from an utterance to learn the utterance-level speaker embedding, while using the average embedding of all segments of a test utterance to verify…

Audio and Speech Processing · Electrical Eng. & Systems 2018-11-05 Bin Liu, Shuai Nie, Yaping Zhang, Shan Liang, Wenju Liu

The presence of non-speech segments in utterances often leads to the performance degradation of speaker verification. Existing systems usually use voice activation detection as a preprocessing step to cut off long silence segments. However,…

Audio and Speech Processing · Electrical Eng. & Systems 2025-08-21 Zijun Huang, Chengdong Liang, Jiadi Yao, Xiao-Lei Zhang

This work presents a novel back-end framework for speaker verification using graph attention networks. Segment-wise speaker embeddings extracted from multiple crops within an utterance are interpreted as node representations of a graph. The…

Audio and Speech Processing · Electrical Eng. & Systems 2021-02-09 Jee-weon Jung, Hee-Soo Heo, Ha-Jin Yu, Joon Son Chung

Currently, the most widely used approach for speaker verification is the deep speaker embedding learning. In this approach, we obtain a speaker embedding vector by pooling single-scale features that are extracted from the last layer of a…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-09 Youngmoon Jung, Seong Min Kye, Yeunju Choi, Myunghun Jung, Hoirin Kim

This paper analyses the short utterance probabilistic linear discriminant analysis (PLDA) speaker verification with utterance partitioning and short utterance variance (SUV) modelling approaches. Experimental studies have found that instead…

Sound · Computer Science 2016-10-18 Ahilan Kanagasundaram, David Dean, Sridha Sridharan, Clinton Fookes

Creating universal speaker encoders which are robust for different acoustic and speech duration conditions is a big challenge today. According to our observations systems trained on short speech segments are optimal for short phrase speaker…

Sound · Computer Science 2022-10-31 Sergey Novoselov, Vladimir Volokhov, Galina Lavrentyeva

One of the most important parts of an end-to-end speaker verification system is the speaker embedding generation. In our previous paper, we reported that shortcut connections-based multi-layer aggregation improves the representational power…

Audio and Speech Processing · Electrical Eng. & Systems 2020-07-29 Soonshin Seo, Ji-Hwan Kim

Speaker verification (SV) performance deteriorates as utterances become shorter. To this end, we propose a new architecture called VoiceExtender which provides a promising solution for improving SV performance when handling short-duration…

Sound · Computer Science 2023-10-10 Yayun He, Zuheng Kang, Jianzong Wang, Junqing Peng, Jing Xiao

Automatic speaker verification (ASV) is the process to recognize persons using voice as biometric. The ASV systems show considerable recognition performance with sufficient amount of speech from matched condition. One of the crucial…

Multimedia · Computer Science 2018-12-04 Arnab Poddar, Md Sahidullah, Goutam Saha

Despite achieving satisfactory performance in speaker verification using deep neural networks, variable-duration utterances remain a challenge that threatens the robustness of systems. To deal with this issue, we propose a speaker…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-28 Ju-ho Kim, Hye-jin Shim, Jungwoo Heo, Ha-Jin Yu

Identifying multiple speakers without knowing where a speaker's voice is in a recording is a challenging task. In this paper, a hierarchical attention network is proposed to solve a weakly labelled speaker identification problem. The use of…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-28 Yanpei Shi, Qiang Huang, Thomas Hain

In this paper, a novel architecture for speaker recognition is proposed by cascading speech enhancement and speaker processing. Its aim is to improve speaker recognition performance when speech signals are corrupted by noise. Instead of…

Computation and Language · Computer Science 2020-05-25 Yanpei Shi, Qiang Huang, Thomas Hain

In practical settings, a speaker recognition system needs to identify a speaker given a short utterance, while the enrollment utterance may be relatively long. However, existing speaker recognition models perform poorly with such short…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-12 Seong Min Kye, Youngmoon Jung, Hae Beom Lee, Sung Ju Hwang, Hoirin Kim

Verifying if two audio segments belong to the same speaker has been recently put forward as a flexible way to carry out speaker identification, since it does not require to be re-trained when new speakers appear on the auditory scene.…

Audio and Speech Processing · Electrical Eng. & Systems 2020-10-26 Ivette Velez, Caleb Rascon, Gibran Fuentes-Pineda

Speaker verification systems experience significant performance degradation when tasked with short-duration trial recordings. To address this challenge, a multi-scale feature fusion approach has been proposed to effectively capture speaker…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-05 Yafeng Chen, Siqi Zheng, Hui Wang, Luyao Cheng, Qian Chen, Shiliang Zhang, Junjie Li

In this paper, Whisper, a large-scale pre-trained model for automatic speech recognition, is proposed to apply to speaker verification. A partial multi-scale feature aggregation (PMFA) approach is proposed based on a subset of Whisper…

Sound · Computer Science 2024-08-29 Yiyang Zhao, Shuai Wang, Guangzhi Sun, Zehua Chen, Chao Zhang, Mingxing Xu, Thomas Fang Zheng

Speaker recognition performance has been greatly improved with the emergence of deep learning. Deep neural networks show the capacity to effectively deal with impacts of noise and reverberation, making them attractive to far-field speaker…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-28 Wenda Chen, Jonathan Huang, Tobias Bocklet

In this paper, a hierarchical attention network to generate utterance-level embeddings (H-vectors) for speaker identification is proposed. Since different parts of an utterance may have different contributions to speaker identities, the use…

Computation and Language · Computer Science 2019-10-22 Yanpei Shi, Qiang Huang, Thomas Hain