Related papers: Learnable MFCCs for Speaker Verification

200 papers

Modern automatic speaker verification relies largely on deep neural networks (DNNs) trained on mel-frequency cepstral coefficient (MFCC) features. While there are alternative feature extraction methods based on phase, prosody and long-term…

Audio and Speech Processing · Electrical Eng. & Systems 2020-07-31 Xuechen Liu, Md Sahidullah, Tomi Kinnunen

In this paper, we propose a novel family of windowing techniques to compute Mel Frequency Cepstral Coefficients (MFCCs) for automatic speaker recognition from speech. The proposed method is based on a fundamental property of discrete time…

Computer Vision and Pattern Recognition · Computer Science 2015-06-05 Md. Sahidullah, Goutam Saha

Short-time spectral features such as mel frequency cepstral coefficients (MFCCs) have previously been deployed in state-of-the-art speaker recognition systems; however, less attention has been paid to short-term spectral features that can be…

Audio and Speech Processing · Electrical Eng. & Systems 2018-05-24 Adrish Banerjee, Akash Dubey, Abhishek Menon, Shubham Nanda, Gora Chand Nandi

With the development of speech synthesis techniques, automatic speaker verification systems face the serious challenge of spoofing attacks. In order to improve the reliability of speaker verification systems, we develop a new filter bank…

Sound · Computer Science 2017-02-14 Hong Yu, Zheng-Hua Tan, Zhanyu Ma, Jun Guo

Mel-scale spectrum features are used in various recognition and classification tasks on speech signals. There is no reason to expect that these features are optimal for all different tasks, including speaker verification (SV). This paper…

Audio and Speech Processing · Electrical Eng. & Systems 2022-06-16 Jingyu Li, Yusheng Tian, Tan Lee

Mel Frequency Cepstral Coefficients (MFCCs) are the most popularly used speech features in most speech and speaker recognition applications. In this work, we propose a modified Mel filter bank to extract MFCCs from subsampled speech. We…

Computation and Language · Computer Science 2014-10-29 Kiran Kumar Bhuvanagiri, Sunil Kumar Kopparapu

To improve the performance of speaker identification systems, an effective and robust method is proposed to extract speech features, capable of operating in noisy environments. Based on the time-frequency multi-resolution property of wavelet…

Sound · Computer Science 2010-03-31 Mahmoud I. Abdalla, Hanaa S. Ali

An algorithm involving Mel-Frequency Cepstral Coefficients (MFCCs) is provided to perform signal feature extraction for the task of speaker accent recognition. Then different classifiers are compared based on the MFCC feature. For each…

Sound · Computer Science 2015-02-02 Zichen Ma, Ernest Fokoue

Even the human intelligence system fails to offer 100% accuracy in identifying speech from a specific individual. Machine intelligence is trying to mimic humans in speaker identification problems through various approaches to speech feature…

Audio and Speech Processing · Electrical Eng. & Systems 2022-09-30 Oluyemi E. Adetoyi

Multi-taper estimators provide low-variance power spectrum estimates that can be used in place of the windowed discrete Fourier transform (DFT) to extract speech features such as mel-frequency cepstral coefficients (MFCCs). Even if past…

Sound · Computer Science 2021-10-27 Xuechen Liu, Md Sahidullah, Tomi Kinnunen

After their introduction to robust speech recognition, power normalized cepstral coefficient (PNCC) features were successfully adopted to other tasks, including speaker verification. However, as a feature extractor with long-term operations…

Sound · Computer Science 2021-09-27 Xuechen Liu, Md Sahidullah, Tomi Kinnunen

Mel-frequency cepstral coefficients (MFCCs) are an important feature in speech processing. A deeper understanding of their properties can contribute to the work that is being done with both classical and deep learning models. This study…

Audio and Speech Processing · Electrical Eng. & Systems 2025-10-08 Vitor Magno de O. S. Bezerra, Gabriel F. A. Bastos, Jugurta Montalvão

Speech recognition and speaker identification are important for authentication and verification for security purposes, but they are difficult to achieve. Speaker identification methods can be divided into text-independent and text-dependent.…

Machine Learning · Computer Science 2010-09-28 S. M. Kamruzzaman, A. N. M. Rezaul Karim, Md. Saiful Islam, Md. Emdadul Haque

The objective of this work is to investigate complementary features which can aid the quintessential Mel frequency cepstral coefficients (MFCCs) in the task of closed, limited set word recognition for non-native English speakers of…

Sound · Computer Science 2022-06-16 Pierre Berjon, Rajib Sharma, Avishek Nag, Soumyabrata Dev

Most speech processing applications use triangular filters spaced on the mel scale for feature extraction. In this paper, we propose a new data-driven filter design method which optimizes filter parameters from given speech data.…

Audio and Speech Processing · Electrical Eng. & Systems 2020-07-22 Susanta Sarangi, Md Sahidullah, Goutam Saha

Speech Emotion Recognition (SER) traditionally relies on auditory data analysis for emotion classification. Several studies have adopted different methods for SER. However, existing SER methods often struggle to capture subtle emotional…

Sound · Computer Science 2026-01-23 HyeYoung Lee, Muhammad Nadeem

Recently, deep neural networks (DNNs) have been used to learn speaker features. However, the quality of the learned features is not sufficiently good, so a complex back-end model, either neural or probabilistic, has to be used to address the…

Sound · Computer Science 2017-05-11 Lantian Li, Yixiang Chen, Ying Shi, Zhiyuan Tang, Dong Wang

Mel Frequency Cepstral Coefficients (MFCCs) are the most popularly used speech features in most speech and speaker recognition applications. In this paper, we study the effect of resampling a speech signal on these speech features. We first…

Sound · Computer Science 2014-10-28 Laxmi Narayana M., Sunil Kumar Kopparapu

This paper proposes a speech emotion recognition method based on speech features and speech transcriptions (text). Speech features such as Spectrogram and Mel-frequency Cepstral Coefficients (MFCC) help retain emotion-related low-level…

Audio and Speech Processing · Electrical Eng. & Systems 2019-06-14 Suraj Tripathi, Abhay Kumar, Abhiram Ramesh, Chirag Singh, Promod Yenigalla

The intersection of technology and mental health has spurred innovative approaches to assessing emotional well-being, particularly through computational techniques applied to audio data analysis. This study explores the application of…

Sound · Computer Science 2024-12-17 Idoko Agbo, Dr Hoda El-Sayed, M. D Kamruzzan Sarker