Related papers: Histogram Transform-based Speaker Identification
To improve the performance of speaker identification systems, an effective and robust method is proposed to extract speech features, capable of operating in noisy environment. Based on the time-frequency multi-resolution property of wavelet…
In this paper, we propose a novel family of windowing technique to compute Mel Frequency Cepstral Coefficient (MFCC) for automatic speaker recognition from speech. The proposed method is based on fundamental property of discrete time…
This paper proposes a novel Wavelet Packet based feature extraction approach for the task of text independent speaker recognition. The features are extracted by using the combination of Mel Frequency Cepstral Coefficient (MFCC) and Wavelet…
Speech recognition and speaker identification are important for authentication and verification in security purpose, but they are difficult to achieve. Speaker identification methods can be divided into text-independent and text-dependent.…
We propose a new feature, namely, pitchsynchronous discrete cosine transform (PS-DCT), for the task of speaker identification. These features are obtained directly from the voiced segments of the speech signal, without any preemphasis or…
Even human intelligence system fails to offer 100% accuracy in identifying speeches from a specific individual. Machine intelligence is trying to mimic humans in speaker identification problems through various approaches to speech feature…
Due to improvements in artificial intelligence, speaker identification (SI) technologies have brought a great direction and are now widely used in a variety of sectors. One of the most important components of SI is feature extraction, which…
Speaker verification is the process by which a speakers claim of identity is tested against a claimed speaker by his or her voice. Speaker verification is done by the use of some parameters (features) from the speakers voice which can be…
Feature extraction plays an important role as a front-end processing block in speaker identification (SI) process. Most of the SI systems utilize like Mel-Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), Linear…
This paper introduces the performance evaluation of statistical approaches for TextIndependent speaker recognition system using source feature. Linear prediction LP residual is used as a representation of excitation information in speech.…
This paper introduces and motivates the use of hybrid robust feature extraction technique for spoken language identification (LID) system. The speech recognizers use a parametric form of a signal to get the most important distinguishable…
In this paper, we combine Hidden Markov Models (HMMs) with i-vector extractors to address the problem of text-dependent speaker recognition with random digit strings. We employ digit-specific HMMs to segment the utterances into digits, to…
Speech is a natural form of communication for human beings, and computers with the ability to understand speech and speak with a human voice are expected to contribute to the development of more natural man-machine interfaces. Computers…
The most pressing challenge in the field of voice biometrics is selecting the most efficient technique of speaker recognition. Every individual's voice is peculiar, factors like physical differences in vocal organs, accent and pronunciation…
Speech Emotion Recognition (SER) is the use of machines to detect the emotional state of humans based on the speech, which is gaining importance in natural human-computer interaction. Speech is a very valuable source of information, as…
In this work, we incorporated acoustically derived source features, aperiodicity, periodicity and pitch as additional targets to an acoustic-to-articulatory speech inversion (SI) system. We also propose a Temporal Convolution based SI…
We propose a learnable mel-frequency cepstral coefficient (MFCC) frontend architecture for deep neural network (DNN) based automatic speaker verification. Our architecture retains the simplicity and interpretability of MFCC-based features…
Extracting features from the speech is the most critical process in speech signal processing. Mel Frequency Cepstral Coefficients (MFCC) are the most widely used features in the majority of the speaker and speech recognition applications,…
Systems based on automatic speech recognition (ASR) technology can provide important functionality in computer assisted language learning applications. This is a young but growing area of research motivated by the large number of students…
In this paper, an improved strategy for automated text dependent speaker identification system has been proposed in noisy environment. The identification process incorporates the Neuro- Genetic hybrid algorithm with cepstral based features.…