Related papers: Multi-speaker Emotion Conversion via Latent Variab…

200 papers

Automated emotion detection in speech is a challenging task due to the complex interdependence between words and the manner in which they are spoken. It is made more difficult by the available datasets; their small size and incompatible…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-16 Amith Ananthram, Kailash Karthik Saravanakumar, Jessica Huynh, Homayoon Beigi

Speech emotion conversion is the task of modifying the perceived emotion of a speech utterance while preserving the lexical content and speaker identity. In this study, we cast the problem of emotion conversion as a spoken language…

Modern day conversational agents are trained to emulate the manner in which humans communicate. To emotionally bond with the user, these virtual agents need to be aware of the affective state of the user. Transformers are the recent state…

Sound · Computer Science 2022-04-26 Raman Goel, Seba Susan, Sachin Vashisht, Armaan Dhanda

Speech Emotion Recognition is a crucial area of research in human-computer interaction. While significant work has been done in this field, many state-of-the-art networks struggle to accurately recognize emotions in speech when the data is…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-23 Rashedul Hasan, Meher Nigar, Nursadul Mamun, Sayan Paul

Existing neural conversational models process natural language primarily on a lexico-syntactic level, thereby ignoring one of the most crucial components of human-to-human dialogue: its affective content. We take a step in this direction by…

Computation and Language · Computer Science 2017-09-14 Nabiha Asghar, Pascal Poupart, Jesse Hoey, Xin Jiang, Lili Mou

Emotional voice conversion aims to convert the emotion of speech from one state to another while preserving the linguistic content and speaker identity. The prior studies on emotional voice conversion are mostly carried out under the…

Sound · Computer Science 2020-10-14 Kun Zhou, Berrak Sisman, Mingyang Zhang, Haizhou Li

A general disentanglement-based speaker anonymization system typically separates speech into content, speaker, and prosody features using individual encoders. This paper explores how to adapt such a system when a new speech attribute, for…

We introduce SEDTalker, an emotion-aware framework for speech-driven 3D facial animation that leverages frame-level speech emotion diarization to achieve fine-grained expressive control. Unlike prior approaches that rely on utterance-level…

Computer Vision and Pattern Recognition · Computer Science 2026-04-16 Farzaneh Jafari, Stefano Berretti, Anup Basu

The Emotional Voice Conversion (EVC) aims to convert the discrete emotional state from the source emotion to the target for a given speech utterance while preserving linguistic content. In this paper, we propose regularizing emotion…

Audio and Speech Processing · Electrical Eng. & Systems 2024-12-31 Ashishkumar Gudmalwar, Ishan D. Biyani, Nirmesh Shah, Pankaj Wasnik, Rajiv Ratn Shah

Expressive voice conversion (VC) conducts speaker identity conversion for emotional speakers by jointly converting speaker identity and emotional style. Emotional style modeling for arbitrary speakers in expressive VC has not been…

Audio and Speech Processing · Electrical Eng. & Systems 2024-05-06 Zongyang Du, Junchen Lu, Kun Zhou, Lakshmish Kaushik, Berrak Sisman

Emotional voice conversion (EVC) is one way to generate expressive synthetic speech. Previous approaches mainly focused on modeling one-to-one mapping, i.e., conversion from one emotional state to another emotional state, with Mel-cepstral…

Audio and Speech Processing · Electrical Eng. & Systems 2020-04-09 Songxiang Liu, Yuewen Cao, Helen Meng

The mainstream paradigm of speech emotion recognition (SER) is identifying the single emotion label of the entire utterance. This line of work neglects the emotion dynamics at fine temporal granularity and mostly fails to leverage linguistic…

Sound · Computer Science 2024-03-29 Siyuan Shen, Yu Gao, Feng Liu, Hanyang Wang, Aimin Zhou

Speech emotion conversion aims to convert the expressed emotion of a spoken utterance to a target emotion while preserving the lexical information and the speaker's identity. In this work, we specifically focus on in-the-wild emotion…

Audio and Speech Processing · Electrical Eng. & Systems 2023-06-06 Navin Raj Prabhu, Nale Lehmann-Willenbrock, Timo Gerkmann

Precise control over speech characteristics, such as pitch, duration, and speech rate, remains a significant challenge in the field of voice conversion. The ability to manipulate parameters like pitch and syllable rate is an important…

Sound · Computer Science 2025-07-08 Mathilde Abrassart, Nicolas Obin, Axel Roebel

This paper proposes a Convolutional Neural Network (CNN) inspired by Multitask Learning (MTL) and based on speech features trained under the joint supervision of softmax loss and center loss, a powerful metric learning strategy, for the…

Sound · Computer Science 2019-09-04 Suraj Tripathi, Abhiram Ramesh, Abhay Kumar, Chirag Singh, Promod Yenigalla

While there have been significant advances in detecting emotions in text, in the field of utterance-level emotion recognition (ULER), there are still many problems to be solved. In this paper, we address some challenges in ULER in dialog…

Computation and Language · Computer Science 2020-02-19 QingBiao Li, ChunHua Wu, KangFeng Zheng, Zhe Wang

Emotional state of a speaker is found to have significant effect in speech production, which can deviate speech from that arising from neutral state. This makes identifying speakers with different emotions a challenging task as generally…

Audio and Speech Processing · Electrical Eng. & Systems 2020-10-09 Biswajit Dev Sarma, Rohan Kumar Das

Large language models are routinely deployed on text that varies widely in emotional tone, yet their reasoning behavior is typically evaluated without accounting for emotion as a source of representational variation. Prior work has largely…

Computation and Language · Computer Science 2026-03-17 Benjamin Reichman, Adar Avsian, Samuel Webster, Larry Heck

Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while preserving the linguistic content and speaker identity. In EVC, emotions are usually treated as discrete categories overlooking the fact that speech…

Sound · Computer Science 2022-07-19 Kun Zhou, Berrak Sisman, Rajib Rana, Björn W. Schuller, Haizhou Li

This paper introduces a new framework for non-parallel emotion conversion in speech. Our framework is based on two key contributions. First, we propose a stochastic version of the popular CycleGAN model. Our modified loss function…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-10 Ravi Shankar, Hsi-Wei Hsieh, Nicolas Charon, Archana Venkataraman