Related papers: Automatic Speech Recognition Using Template Model …
Digital processing of speech signal and voice recognition algorithm is very important for fast and accurate automatic voice recognition technology. The voice is a signal of infinite information. A direct analysis and synthesizing the…
Speech Emotion Recognition (SER) is the use of machines to detect the emotional state of humans based on the speech, which is gaining importance in natural human-computer interaction. Speech is a very valuable source of information, as…
Even human intelligence system fails to offer 100% accuracy in identifying speeches from a specific individual. Machine intelligence is trying to mimic humans in speaker identification problems through various approaches to speech feature…
Automatic Speech Recognition involves mainly two steps; feature extraction and classification . Mel Frequency Cepstral Coefficient is used as one of the prominent feature extraction techniques in ASR. Usually, the set of all 12 MFCC…
To improve the performance of speaker identification systems, an effective and robust method is proposed to extract speech features, capable of operating in noisy environment. Based on the time-frequency multi-resolution property of wavelet…
In this paper, we propose a novel family of windowing technique to compute Mel Frequency Cepstral Coefficient (MFCC) for automatic speaker recognition from speech. The proposed method is based on fundamental property of discrete time…
This paper proposes a speech emotion recognition method based on speech features and speech transcriptions (text). Speech features such as Spectrogram and Mel-frequency Cepstral Coefficients (MFCC) help retain emotion-related low-level…
Speech recognition has become an important task in the development of machine learning and artificial intelligence. In this study, we explore the important task of keyword spotting using speech recognition machine learning and deep learning…
The ability to modulate vocal sounds and generate speech is one of the features which set humans apart from other living beings. The human voice can be characterized by several attributes such as pitch, timbre, loudness, and vocal tone. It…
This paper is devoted to improve automatic emotion recognition from speech by incorporating rhythm and temporal features. Research on automatic emotion recognition so far has mostly been based on applying features like MFCCs, pitch and…
The speech feature extraction has been a key focus in robust speech recognition research; it significantly affects the recognition performance. In this paper, we first study a set of different features extraction methods such as linear…
One of the most crucial aspects of communication in daily life is speech recognition. Speech recognition that is based on natural language processing is one of the essential elements in the conversion of one system to another. In this…
Extracting features from the speech is the most critical process in speech signal processing. Mel Frequency Cepstral Coefficients (MFCC) are the most widely used features in the majority of the speaker and speech recognition applications,…
Systems based on automatic speech recognition (ASR) technology can provide important functionality in computer assisted language learning applications. This is a young but growing area of research motivated by the large number of students…
Speech recognition and speaker identification are important for authentication and verification in security purpose, but they are difficult to achieve. Speaker identification methods can be divided into text-independent and text-dependent.…
This paper introduces and motivates the use of hybrid robust feature extraction technique for spoken language identification (LID) system. The speech recognizers use a parametric form of a signal to get the most important distinguishable…
In this paper the task of emotion recognition from speech is considered. Proposed approach uses deep recurrent neural network trained on a sequence of acoustic features calculated over small speech intervals. At the same time special…
There are a variety of features of the human voice that can be classified as pitch, timbre, loudness, and vocal tone. It is observed in numerous incidents that human expresses their feelings using different vocal qualities when they are…
Speech recognition has of late become a practical technology for real world applications. Aiming at speech-driven text retrieval, which facilitates retrieving information with spoken queries, we propose a method to integrate speech…
An efficient speech to text converter for mobile application is presented in this work. The prime motive is to formulate a system which would give optimum performance in terms of complexity, accuracy, delay and memory requirements for…