Related papers: Speech Emotion Recognition using Self-Supervised F…

200 papers

Speech Emotion Recognition (SER) often operates on speech segments detected by a Voice Activity Detection (VAD) model. However, VAD models may output flawed speech segments, especially in noisy environments, resulting in degraded…

Sound · Computer Science 2024-10-18 Natsuo Yamashita, Masaaki Yamamoto, Yohei Kawaguchi

Speech Emotion Recognition (SER) aims to help machines understand humans' subjective emotions from audio information alone. However, extracting and utilizing comprehensive in-depth audio information is still a challenging task. In this…

Sound · Computer Science 2022-03-30 Heqing Zou, Yuke Si, Chen Chen, Deepu Rajan, Eng Siong Chng

Speech Emotion Recognition (SER) plays a pivotal role in enhancing human-computer interaction by enabling a deeper understanding of emotional states across a wide range of applications, contributing to more empathetic and effective…

Audio and Speech Processing · Electrical Eng. & Systems 2023-09-25 Amirali Soltani Tehrani, Niloufar Faridani, Ramin Toosi

Speech is the most natural way of expressing ourselves as humans. Identifying emotion from speech is a nontrivial task due to the ambiguous definition of emotion itself. Speaker Emotion Recognition (SER) is essential for understanding human…

Sound · Computer Science 2024-11-07 Pourya Jafarzadeh, Amir Mohammad Rostami, Padideh Choobdar

Affective computing is very important in the relationship between man and machine. In this paper, a system for speech emotion recognition (SER) based on speech signal is proposed, which uses new techniques in different stages of processing.…

Sound · Computer Science 2021-11-16 Fatemeh Daneshfar, Seyed Jahanshah Kabudian

Emotion recognition is a challenging task due to limited availability of in-the-wild labeled datasets. Self-supervised learning has shown improvements on tasks with limited labeled datasets in domains like speech and natural language.…

Computation and Language · Computer Science 2021-04-08 Aparna Khare, Srinivas Parthasarathy, Shiva Sundaram

We propose a novel transfer learning method for speech emotion recognition allowing us to obtain promising results when only a small amount of training data is available. With as few as 125 examples per emotion class, we were able to reach a higher…

Machine Learning · Computer Science 2020-11-12 Jonathan Boigne, Biman Liyanage, Ted Östrem

This work presents our end-to-end (E2E) automatic speech recognition (ASR) model targeting robust speech recognition, called Integrated speech Recognition with enhanced speech Input for Self-supervised learning representation (IRIS).…

Sound · Computer Science 2022-04-04 Xuankai Chang, Takashi Maekaku, Yuya Fujita, Shinji Watanabe

Transformer networks and self-supervised pre-training have consistently delivered state-of-the-art results in the field of natural language processing (NLP); however, their merits in the field of spoken language understanding (SLU) still need…

Computation and Language · Computer Science 2020-11-18 Edmilson Morais, Hong-Kwang J. Kuo, Samuel Thomas, Zoltan Tuske, Brian Kingsbury

End-to-end (E2E) systems have played an increasingly important role in automatic speech recognition (ASR) and achieved great performance. However, E2E systems recognize output word sequences directly from the input acoustic features, which…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-04 Qi Liu, Zhehuai Chen, Hao Li, Mingkun Huang, Yizhou Lu, Kai Yu

Speech Emotion Recognition (SER) involves analyzing vocal expressions to determine the emotional state of speakers, where the comprehensive and thorough utilization of audio information is paramount. Therefore, we propose a novel approach…

Audio and Speech Processing · Electrical Eng. & Systems 2025-04-29 Zixiang Wan, Ziyue Qiu, Yiyang Liu, Wei-Qiang Zhang

Emotion and intent recognition from speech is essential and has been widely investigated in human-computer interaction. The rapid development of social media platforms, chatbots, and other technologies has led to a large volume of speech…

Sound · Computer Science 2025-07-11 Zhao Ren, Rathi Adarshi Rammohan, Kevin Scheck, Sheng Li, Tanja Schultz

Emotion recognition models using audio input data can enable the development of interactive systems with applications in mental healthcare, marketing, gaming, and social media analysis. While the field of affective computing using audio…

Sound · Computer Science 2023-07-25 Peranut Nimitsurachat, Peter Washington

Speech emotion recognition is a challenging task and an important step towards more natural human-machine interaction. We show that pre-trained language models can be fine-tuned for text emotion recognition, achieving an accuracy of 69.5%…

Audio and Speech Processing · Electrical Eng. & Systems 2019-12-06 Verena Heusser, Niklas Freymuth, Stefan Constantin, Alex Waibel

Emotion recognition is a topic of significant interest in assistive robotics due to the need to equip robots with the ability to comprehend human behavior, facilitating their effective interaction in our society. Consequently, efficient and…

Human-Computer Interaction · Computer Science 2023-12-05 Rutherford Agbeshi Patamia, Paulo E. Santos, Kingsley Nketia Acheampong, Favour Ekong, Kwabena Sarpong, She Kun

In recent years, speech emotion recognition (SER) has been used in wide-ranging applications, from healthcare to the commercial sector. In addition to signal processing approaches, methods for SER now also use deep learning techniques.…

Audio and Speech Processing · Electrical Eng. & Systems 2021-05-06 Sneha Das, Nicole Nadine Lønfeldt, Anne Katrine Pagsberg, Line H. Clemmensen

Emotion plays a fundamental role in human interaction, and therefore systems capable of identifying emotions in speech are crucial in the context of human-computer interaction. Speech emotion recognition (SER) is a challenging problem,…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-03 Lucas Ueda, João Lima, Leonardo Marques, Paula Costa

This paper presents a novel end-to-end LLM-empowered explainable speech emotion recognition (SER) approach. Fine-grained speech emotion descriptor (SED) features, e.g., pitch, tone and emphasis, are disentangled from HuBERT SSL…

Multilingual end-to-end (E2E) models have shown great promise in expanding automatic speech recognition (ASR) coverage of the world's languages. They have shown improvement over monolingual systems, and have simplified training and…

Audio and Speech Processing · Electrical Eng. & Systems 2019-09-13 Anjuli Kannan, Arindrima Datta, Tara N. Sainath, Eugene Weinstein, Bhuvana Ramabhadran, Yonghui Wu, Ankur Bapna, Zhifeng Chen, Seungji Lee

Speech recognition applications cover a range of different audio and text distributions, with different speaking styles, background noise, transcription punctuation and character casing. However, many speech recognition systems require…

Computation and Language · Computer Science 2022-10-25 Sanchit Gandhi, Patrick von Platen, Alexander M. Rush