Related papers: Automatic Speech Recognition Using Template Model …

Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques

Digital processing of speech signal and voice recognition algorithm is very important for fast and accurate automatic voice recognition technology. The voice is a signal of infinite information. A direct analysis and synthesizing the…

Multimedia · Computer Science 2010-03-23 Lindasalwa Muda, Mumtaj Begam, I. Elamvazuthi

Speech Emotion Recognition Using MFCC Features and LSTM-Based Deep Learning Model

Speech Emotion Recognition (SER) is the use of machines to detect the emotional state of humans based on the speech, which is gaining importance in natural human-computer interaction. Speech is a very valuable source of information, as…

Sound · Computer Science 2026-04-30 Adelekun Oluwademilade, Ademola Adedamola, Abiola Abdulhakeem, Akinpelu Azeezat, Eraiyetan Israel, Omotosho Oluwadunsin, Ibenye Ikechukwu, Ayuba Muhammad, Olusanya Olamide, Kamorudeen Amuda

Text Independent Speaker Identification System for Access Control

Even human intelligence system fails to offer 100% accuracy in identifying speeches from a specific individual. Machine intelligence is trying to mimic humans in speaker identification problems through various approaches to speech feature…

Audio and Speech Processing · Electrical Eng. & Systems 2022-09-30 Oluyemi E. Adetoyi

Feature selection using Fisher's ratio technique for automatic speech recognition

Automatic Speech Recognition involves mainly two steps; feature extraction and classification. Mel Frequency Cepstral Coefficient is used as one of the prominent feature extraction techniques in ASR. Usually, the set of all 12 MFCC…

Computation and Language · Computer Science 2015-05-14 Sarika Hegde, K. K. Achary, Surendra Shetty

Wavelet-Based Mel-Frequency Cepstral Coefficients for Speaker Identification using Hidden Markov Models

To improve the performance of speaker identification systems, an effective and robust method is proposed to extract speech features, capable of operating in noisy environment. Based on the time-frequency multi-resolution property of wavelet…

Sound · Computer Science 2010-03-31 Mahmoud I. Abdalla, Hanaa S. Ali

A Novel Windowing Technique for Efficient Computation of MFCC for Speaker Recognition

In this paper, we propose a novel family of windowing technique to compute Mel Frequency Cepstral Coefficient (MFCC) for automatic speaker recognition from speech. The proposed method is based on fundamental property of discrete time…

Computer Vision and Pattern Recognition · Computer Science 2015-06-05 Md. Sahidullah, Goutam Saha

Deep Learning based Emotion Recognition System Using Speech Features and Transcriptions

This paper proposes a speech emotion recognition method based on speech features and speech transcriptions (text). Speech features such as Spectrogram and Mel-frequency Cepstral Coefficients (MFCC) help retain emotion-related low-level…

Audio and Speech Processing · Electrical Eng. & Systems 2019-06-14 Suraj Tripathi, Abhay Kumar, Abhiram Ramesh, Chirag Singh, Promod Yenigalla

Keyword spotting -- Detecting commands in speech using deep learning

Speech recognition has become an important task in the development of machine learning and artificial intelligence. In this study, we explore the important task of keyword spotting using speech recognition machine learning and deep learning…

Sound · Computer Science 2023-12-12 Sumedha Rai, Tong Li, Bella Lyu

Detection and Analysis of Human Emotions through Voice and Speech Pattern Processing

The ability to modulate vocal sounds and generate speech is one of the features which set humans apart from other living beings. The human voice can be characterized by several attributes such as pitch, timbre, loudness, and vocal tone. It…

Computer Vision and Pattern Recognition · Computer Science 2017-10-30 Poorna Banerjee Dasgupta

Improving Automatic Emotion Recognition from speech using Rhythm and Temporal feature

This paper is devoted to improve automatic emotion recognition from speech by incorporating rhythm and temporal features. Research on automatic emotion recognition so far has mostly been based on applying features like MFCCs, pitch and…

Computer Vision and Pattern Recognition · Computer Science 2013-03-08 Mayank Bhargava, Tim Polzehl

On the Use of Different Feature Extraction Methods for Linear and Non Linear kernels

The speech feature extraction has been a key focus in robust speech recognition research; it significantly affects the recognition performance. In this paper, we first study a set of different features extraction methods such as linear…

Computation and Language · Computer Science 2014-07-01 Imen Trabelsi, Dorra Ben Ayed

Conversion of Acoustic Signal (Speech) Into Text By Digital Filter using Natural Language Processing

One of the most crucial aspects of communication in daily life is speech recognition. Speech recognition that is based on natural language processing is one of the essential elements in the conversion of one system to another. In this…

Artificial Intelligence · Computer Science 2022-09-12 Abhiram Katuri, Sindhu Salugu, Gelli Tharuni, Challa Sri Gouri

Audio Signal Processing Using Time Domain Mel-Frequency Wavelet Coefficient

Extracting features from the speech is the most critical process in speech signal processing. Mel Frequency Cepstral Coefficients (MFCC) are the most widely used features in the majority of the speaker and speech recognition applications,…

Sound · Computer Science 2025-10-31 Rinku Sebastian, Simon O'Keefe, Martin Trefzer

Adaptive Frequency Cepstral Coefficients for Word Mispronunciation Detection

Systems based on automatic speech recognition (ASR) technology can provide important functionality in computer assisted language learning applications. This is a young but growing area of research motivated by the large number of students…

Sound · Computer Science 2016-02-29 Zhenhao Ge, Sudhendu R. Sharma, Mark J. T. Smith

Speaker Identification using MFCC-Domain Support Vector Machine

Speech recognition and speaker identification are important for authentication and verification in security purpose, but they are difficult to achieve. Speaker identification methods can be divided into text-independent and text-dependent.…

Machine Learning · Computer Science 2010-09-28 S. M. Kamruzzaman, A. N. M. Rezaul Karim, Md. Saiful Islam, Md. Emdadul Haque

Spoken Language Identification Using Hybrid Feature Extraction Methods

This paper introduces and motivates the use of hybrid robust feature extraction technique for spoken language identification (LID) system. The speech recognizers use a parametric form of a signal to get the most important distinguishable…

Sound · Computer Science 2010-03-31 Pawan Kumar, Astik Biswas, A. N. Mishra, Mahesh Chandra

Emotion Recognition From Speech With Recurrent Neural Networks

In this paper the task of emotion recognition from speech is considered. Proposed approach uses deep recurrent neural network trained on a sequence of acoustic features calculated over small speech intervals. At the same time special…

Computation and Language · Computer Science 2018-07-06 Vladimir Chernykh, Pavel Prikhodko

Emotional Expression Detection in Spoken Language Employing Machine Learning Algorithms

There are a variety of features of the human voice that can be classified as pitch, timbre, loudness, and vocal tone. It is observed in numerous incidents that human expresses their feelings using different vocal qualities when they are…

Sound · Computer Science 2023-04-24 Mehrab Hosain, Most. Yeasmin Arafat, Gazi Zahirul Islam, Jia Uddin, Md. Mobarak Hossain, Fatema Alam

Speech-Driven Text Retrieval: Using Target IR Collections for Statistical Language Model Adaptation in Speech Recognition

Speech recognition has of late become a practical technology for real world applications. Aiming at speech-driven text retrieval, which facilitates retrieving information with spoken queries, we propose a method to integrate speech…

Computation and Language · Computer Science 2007-05-23 Atsushi Fujii, Katunobu Itou, Tetsuya Ishikawa

Speaker Independent Continuous Speech to Text Converter for Mobile Application

An efficient speech to text converter for mobile application is presented in this work. The prime motive is to formulate a system which would give optimum performance in terms of complexity, accuracy, delay and memory requirements for…

Computation and Language · Computer Science 2013-07-23 R. Sandanalakshmi, P. Abinaya Viji, M. Kiruthiga, M. Manjari, M. Sharina