Related papers: Voice Conversion from Non-parallel Corpora Using V…

200 papers

We propose a flexible framework that deals with both singer conversion and singers' vocal technique conversion. The proposed model is trained on non-parallel corpora, accommodates many-to-many conversion, and leverages recent advances of…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-26 Yin-Jyun Luo, Chin-Chen Hsu, Kat Agres, Dorien Herremans

An effective approach to non-parallel voice conversion (VC) is to utilize deep neural networks (DNNs), specifically variational autoencoders (VAEs), to model the latent structure of speech in an unsupervised manner. A previous study has…

Audio and Speech Processing · Electrical Eng. & Systems 2020-04-09 Wen-Chin Huang, Hsin-Te Hwang, Yu-Huai Peng, Yu Tsao, Hsin-Min Wang

Building a voice conversion (VC) system from non-parallel speech corpora is challenging but highly valuable in real application scenarios. In most situations, the source and the target speakers do not repeat the same texts or they may even…

Computation and Language · Computer Science 2017-06-09 Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang

In this paper, we present a novel technique for non-parallel voice conversion (VC) using cyclic variational autoencoder (CycleVAE)-based spectral modeling. In a variational autoencoder (VAE) framework, a latent space, usually…

Audio and Speech Processing · Electrical Eng. & Systems 2019-07-25 Patrick Lumban Tobing, Yi-Chiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda

Semantic encoders and decoders for digital semantic communication (SC) often struggle to adapt to variations in unpredictable channel environments and diverse system designs. To address these challenges, this paper proposes a novel…

Signal Processing · Electrical Eng. & Systems 2025-03-20 Yongjeong Oh, Joohyuk Park, Jinho Choi, Jihong Park, Yo-Seb Jeon

As a foundational technology for intelligent human-computer interaction, voice conversion (VC) seeks to transform speech from any source timbre into any target timbre. Traditional voice conversion methods based on Generative Adversarial…

Sound · Computer Science 2025-06-11 Wenhan Yao, Fen Xiao, Xiarun Chen, Jia Liu, YongQiang He, Weiping Wen

In this paper, we propose a novel voice conversion strategy to resolve the mismatch between the training and conversion scenarios when a parallel speech corpus is unavailable for training. Based on auto-encoder and disentanglement frameworks,…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-05 Yoohwan Kwon, Soo-Whan Chung, Hee-Soo Heo, Hong-Goo Kang

Recognition-synthesis-based methods have recently become quite popular for voice conversion (VC). By introducing linguistic features with good disentangling characteristics extracted from an automatic speech recognition (ASR) model, the VC…

Sound · Computer Science 2023-05-17 Xintao Zhao, Shuai Wang, Yang Chao, Zhiyong Wu, Helen Meng

We present a method for converting voices between a set of speakers. Our method is based on training multiple autoencoder paths, where there is a single speaker-independent encoder and multiple speaker-dependent decoders. The…

Audio and Speech Processing · Electrical Eng. & Systems 2019-05-13 Orhan Ocal, Oguz H. Elibol, Gokce Keskin, Cory Stephenson, Anil Thomas, Kannan Ramchandran

Recently, cycle-consistent adversarial network (Cycle-GAN) has been successfully applied to voice conversion to a different speaker without parallel data, although in those approaches an individual model is needed for each target speaker.…

Audio and Speech Processing · Electrical Eng. & Systems 2018-06-26 Ju-chieh Chou, Cheng-chieh Yeh, Hung-yi Lee, Lin-shan Lee

One-shot voice conversion (VC) aims to change the timbre of any source speech to match that of the target speaker with only one speech sample. Existing style transfer-based VC methods relied on speech representation disentanglement and…

Sound · Computer Science 2024-11-26 Wenhan Yao, Zedong Xing, Xiarun Chen, Jia Liu, Yongqiang He, Weiping Wen

Singing voice conversion aims to convert a singer's voice from source to target without changing the singing content. Parallel training data is typically required for training a singing voice conversion system, which is, however, not practical…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-04 Junchen Lu, Kun Zhou, Berrak Sisman, Haizhou Li

We present a modification to spectrum-differential-based direct waveform modification for voice conversion (DIFFVC) so that it can be directly applied as a waveform generation module to voice conversion models. The recently proposed…

Audio and Speech Processing · Electrical Eng. & Systems 2019-07-30 Wen-Chin Huang, Yi-Chiao Wu, Kazuhiro Kobayashi, Yu-Huai Peng, Hsin-Te Hwang, Patrick Lumban Tobing, Yu Tsao, Hsin-Min Wang, Tomoki Toda

Recent research has shown that word embedding spaces learned from text corpora of different languages can be aligned without any parallel data supervision. Inspired by the success in unsupervised cross-lingual word embeddings, in this paper…

Computation and Language · Computer Science 2018-09-24 Yu-An Chung, Wei-Hung Weng, Schrasing Tong, James Glass

In this paper, we present open-source software named crank for developing a nonparallel voice conversion (VC) system. Although we have released open-source VC software based on the Gaussian mixture model named sprocket in the last VC…

Audio and Speech Processing · Electrical Eng. & Systems 2021-03-05 Kazuhiro Kobayashi, Wen-Chin Huang, Yi-Chiao Wu, Patrick Lumban Tobing, Tomoki Hayashi, Tomoki Toda

This paper presents a method of sequence-to-sequence (seq2seq) voice conversion using non-parallel training data. In this method, disentangled linguistic and speaker representations are extracted from acoustic features, and voice conversion…

Audio and Speech Processing · Electrical Eng. & Systems 2020-01-14 Jing-Xuan Zhang, Zhen-Hua Ling, Li-Rong Dai

We study the problem of cross-lingual voice conversion with non-parallel speech corpora in a one-shot learning setting. Most prior work requires either parallel speech corpora or a sufficient amount of training data from a target speaker. However, we…

Sound · Computer Science 2018-08-17 Seyed Hamidreza Mohammadi, Taehwan Kim

This paper introduces FastVC, an end-to-end model for fast Voice Conversion (VC). The proposed model can convert speech of arbitrary length from multiple source speakers to multiple target speakers. FastVC is based on a conditional…

Audio and Speech Processing · Electrical Eng. & Systems 2021-05-07 Oriol Barbany Mayor, Milos Cernak

Voice conversion generates new speech with the source content and a target voice style. In this paper, we focus on one general setting, i.e., non-parallel many-to-many voice conversion, which is close to the real-world scenario. As…

Sound · Computer Science 2022-07-28 Jian Ma, Zhedong Zheng, Hao Fei, Feng Zheng, Tat-seng Chua, Yi Yang

Voice conversion (VC) using sequence-to-sequence learning of context posterior probabilities is proposed. Conventional VC using shared context posterior probabilities predicts target speech parameters from the context posterior…

Sound · Computer Science 2017-08-08 Hiroyuki Miyoshi, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari