Related papers: Nonparallel Emotional Speech Conversion

200 papers

Speech emotion conversion is the task of converting the expressed emotion of a spoken utterance to a target emotion while preserving the lexical content and speaker identity. While most existing works in speech emotion conversion rely on…

Audio and Speech Processing · Electrical Eng. & Systems 2024-01-09 Navin Raj Prabhu, Bunlong Lay, Simon Welker, Nale Lehmann-Willenbrock, Timo Gerkmann

Voice conversion (VC) techniques aim to modify speaker identity of an utterance while preserving the underlying linguistic information. Most VC approaches ignore modeling of the speaking style (e.g. emotion and emphasis), which may contain…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-20 Songxiang Liu, Yuewen Cao, Shiyin Kang, Na Hu, Xunying Liu, Dan Su, Dong Yu, Helen Meng

Speech emotion conversion is the task of modifying the perceived emotion of a speech utterance while preserving the lexical content and speaker identity. In this study, we cast the problem of emotion conversion as a spoken language…

The primary goal of an emotional voice conversion (EVC) system is to convert the emotion of a given speech signal from one style to another without modifying the linguistic content of the signal. Most of the state-of-the-art approaches…

Sound · Computer Science 2023-02-22 Nirmesh Shah, Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe

Emotional voice conversion aims to convert the emotion of speech from one state to another while preserving the linguistic content and speaker identity. The prior studies on emotional voice conversion are mostly carried out under the…

Sound · Computer Science 2020-10-14 Kun Zhou, Berrak Sisman, Mingyang Zhang, Haizhou Li

Text attribute transfer using non-parallel data requires methods that can perform disentanglement of content and linguistic attributes. In this work, we propose multiple improvements over the existing approaches that enable the…

Computation and Language · Computer Science 2017-12-06 Igor Melnyk, Cicero Nogueira dos Santos, Kahini Wadhawan, Inkit Padhi, Abhishek Kumar

Zero-shot emotion transfer in cross-lingual speech synthesis aims to transfer emotion from an arbitrary speech reference in the source language to the synthetic speech in the target language. Building such a system faces challenges of…

Sound · Computer Science 2023-10-09 Yuke Li, Xinfa Zhu, Yi Lei, Hai Li, Junhui Liu, Danming Xie, Lei Xie

In this paper, we propose a novel voice conversion strategy to resolve the mismatch between the training and conversion scenarios when a parallel speech corpus is unavailable for training. Based on auto-encoder and disentanglement frameworks,…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-05 Yoohwan Kwon, Soo-Whan Chung, Hee-Soo Heo, Hong-Goo Kang

In expressive speech synthesis, there are high requirements for emotion interpretation. However, it is time-consuming to acquire emotional audio corpus for arbitrary speakers due to their deduction ability. In response to this problem, this…

Audio and Speech Processing · Electrical Eng. & Systems 2021-10-12 Pengfei Wu, Junjie Pan, Chenchang Xu, Junhui Zhang, Lin Wu, Xiang Yin, Zejun Ma

Humans can effortlessly modify various prosodic attributes, such as the placement of stress and the intensity of sentiment, to convey a specific emotion while maintaining consistent linguistic content. Motivated by this capability, we…

Sound · Computer Science 2023-12-29 Leyuan Qu, Wei Wang, Cornelius Weber, Pengcheng Yue, Taihao Li, Stefan Wermter

Although there has been significant advancement in the field of speech-to-speech translation, conventional models still require language-parallel speech data between the source and target languages for training. In this paper, we introduce…

Computation and Language · Computer Science 2024-03-21 Seung-Bin Kim, Sang-Hoon Lee, Seong-Whan Lee

We introduce a novel method for emotion conversion in speech that does not require parallel training data. Our approach loosely relies on a cycle-GAN schema to minimize the reconstruction error from converting back and forth between emotion…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-12 Ravi Shankar, Jacob Sager, Archana Venkataraman

Despite the remarkable progress made in synthesizing emotional speech from text, it is still challenging to provide emotion information to existing speech segments. Previous methods mainly rely on parallel data, and few works have studied…

Sound · Computer Science 2020-03-06 Xiaoqi Jia, Jianwei Tai, Hang Zhou, Yakai Li, Weijuan Zhang, Haichao Du, Qingjia Huang

This paper presents a method of sequence-to-sequence (seq2seq) voice conversion using non-parallel training data. In this method, disentangled linguistic and speaker representations are extracted from acoustic features, and voice conversion…

Audio and Speech Processing · Electrical Eng. & Systems 2020-01-14 Jing-Xuan Zhang, Zhen-Hua Ling, Li-Rong Dai

Expressing in language is subjective. Everyone has a different style of reading and writing; apparently it all boils down to the way their mind understands things (in a specific format). Language style transfer is a way to preserve the…

Computation and Language · Computer Science 2018-04-12 Ayush Singh, Ritu Palod

Emotional voice conversion aims to convert the spectrum and prosody to change the emotional patterns of speech, while preserving the speaker identity and linguistic content. Many studies require parallel speech data between different…

Audio and Speech Processing · Electrical Eng. & Systems 2020-10-27 Kun Zhou, Berrak Sisman, Haizhou Li

In recent years, emotional text-to-speech has shown considerable progress. However, it requires a large amount of labeled data, which is not easily accessible. Even if it is possible to acquire an emotional speech dataset, there is still a…

Sound · Computer Science 2023-03-16 Suhee Jo, Younggun Lee, Yookyung Shin, Yeongtae Hwang, Taesu Kim

This paper focuses on style transfer on the basis of non-parallel text. This is an instance of a broad family of problems including machine translation, decipherment, and sentiment modification. The key challenge is to separate the content…

Computation and Language · Computer Science 2017-11-07 Tianxiao Shen, Tao Lei, Regina Barzilay, Tommi Jaakkola

We propose a method for speech-to-speech emotion-preserving translation that operates at the level of discrete speech units. Our approach relies on the use of a multilingual emotion embedding that can capture affective information in a…

Audio and Speech Processing · Electrical Eng. & Systems 2023-07-03 Jarod Duret, Titouan Parcollet, Yannick Estève

Speech emotion conversion aims to convert the expressed emotion of a spoken utterance to a target emotion while preserving the lexical information and the speaker's identity. In this work, we specifically focus on in-the-wild emotion…

Audio and Speech Processing · Electrical Eng. & Systems 2023-06-06 Navin Raj Prabhu, Nale Lehmann-Willenbrock, Timo Gerkmann