Related papers: AE-Flow: AutoEncoder Normalizing Flow

200 papers

Non-parallel voice conversion (VC) is typically achieved using lossy representations of the source speech. However, ensuring only speaker identity information is dropped whilst all other information from the source speech is retained is a…

Audio and Speech Processing · Electrical Eng. & Systems 2022-03-16 Thomas Merritt, Abdelhamid Ezzerg, Piotr Biliński, Magdalena Proszewska, Kamil Pokora, Roberto Barra-Chicote, Daniel Korzekwa

Creating realistic and natural-sounding synthetic speech remains a big challenge for voice identities unseen during training. As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of…

Normalizing Flows (NFs) learn invertible mappings between the data and a Gaussian distribution. Prior works usually suffer from two limitations. First, they add random noise to training samples or VAE latents as data augmentation,…

Computer Vision and Pattern Recognition · Computer Science 2025-12-04 Qinyu Zhao, Guangting Zheng, Tao Yang, Rui Zhu, Xingjian Leng, Stephen Gould, Liang Zheng

In voice conversion (VC) applications, diffusion and flow-matching models have exhibited exceptional speech quality and speaker similarity performances. However, they are limited by slow conversion owing to their iterative inference.…

Sound · Computer Science 2026-02-23 Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Yuto Kondo

End-to-end models for raw audio generation are a challenge, especially if they have to work with non-parallel data, which is a desirable setup in many situations. Voice conversion, in which a model has to impersonate a speaker in a…

Machine Learning · Computer Science 2019-09-06 Joan Serrà, Santiago Pascual, Carlos Segura

Recent research showed that an autoencoder trained with speech of a single speaker, called exemplar autoencoder (eAE), can be used for any-to-one voice conversion (VC). Compared to large-scale many-to-many models such as AutoVC, the eAE…

Sound · Computer Science 2022-04-13 Weida Liang, Lantian Li, Wenqiang Du, Dong Wang

Normalizing flows are a powerful class of generative models for continuous random variables, showing both strong model flexibility and the potential for non-autoregressive generation. These benefits are also desired when modeling discrete…

Machine Learning · Statistics 2019-06-06 Zachary M. Ziegler, Alexander M. Rush

Flow models have rapidly become the go-to method for training and deploying large-scale generators, owing their success to inference-time flexibility via adjustable integration steps. A crucial ingredient in flow training is the choice of…

Normalizing flows and variational autoencoders are powerful generative models that can represent complicated density functions. However, they both impose constraints on the models: Normalizing flows use bijective transformations to model…

Machine Learning · Computer Science 2020-11-02 Didrik Nielsen, Priyank Jaini, Emiel Hoogeboom, Ole Winther, Max Welling

Normalizing Flows are generative models that directly maximize the likelihood. Previously, the design of normalizing flows was largely constrained by the need for analytical invertibility. We overcome this constraint by a training procedure…

Machine Learning · Computer Science 2024-04-25 Felix Draxler, Peter Sorrenson, Lea Zimmermann, Armand Rousselot, Ullrich Köthe

Normalizing flows are powerful non-parametric statistical models that function as a hybrid between density estimators and generative models. Current learning algorithms for normalizing flows assume that data points are sampled…

Machine Learning · Computer Science 2023-05-31 Matthias Kirchler, Christoph Lippert, Marius Kloft

Given datasets from multiple domains, a key challenge is to efficiently exploit these data sources for modeling a target domain. Variants of this problem have been studied in many contexts, such as cross-domain translation and domain…

Machine Learning · Computer Science 2019-12-24 Aditya Grover, Christopher Chute, Rui Shu, Zhangjie Cao, Stefano Ermon

In this paper, we establish a connection between the parameterization of flow-based and energy-based generative models, and present a new flow-based modeling approach called energy-based normalizing flow (EBFlow). We demonstrate that by…

Machine Learning · Computer Science 2023-10-31 Chen-Hao Chao, Wei-Fang Sun, Yen-Chang Hsu, Zsolt Kira, Chun-Yi Lee

Normalizing Flows (NFs) have been established as a principled framework for generative modeling. Standard NFs consist of a forward process and a reverse process: the forward process maps data to noise, while the reverse process generates…

Machine Learning · Computer Science 2025-12-12 Yiyang Lu, Qiao Sun, Xianbang Wang, Zhicheng Jiang, Hanhong Zhao, Kaiming He

Text style transfer aims to alter the style of a sentence while preserving its content. Due to the lack of parallel corpora, most recent work focuses on unsupervised methods and often uses cycle construction to train models. Since cycle…

Computation and Language · Computer Science 2022-12-20 Kangchen Zhu, Zhiliang Tian, Ruifeng Luo, Xiaoguang Mao

Video anomaly detection is often seen as a one-class classification (OCC) problem due to the limited availability of anomaly examples. Typically, to tackle this problem, an autoencoder (AE) is trained to reconstruct the input with training…

Computer Vision and Pattern Recognition · Computer Science 2021-10-26 Marcella Astrid, Muhammad Zaigham Zaheer, Jae-Yeong Lee, Seung-Ik Lee

Wavelet transformation stands as a cornerstone in modern data analysis and signal processing. Its mathematical essence is an invertible transformation that discerns slow patterns from fast ones in the frequency domain. Such an invertible…

Machine Learning · Computer Science 2022-01-28 Shuo-Hui Li

Generative models have excelled in audio tasks using approaches such as language models, diffusion, and flow matching. However, existing generative approaches for speech enhancement (SE) face notable challenges: language model-based methods…

Audio and Speech Processing · Electrical Eng. & Systems 2025-05-28 Ziqian Wang, Zikai Liu, Xinfa Zhu, Yike Zhu, Mingshuai Liu, Jun Chen, Longshuai Xiao, Chao Weng, Lei Xie

We propose SelfVC, a training strategy to iteratively improve a voice conversion model with self-synthesized examples. Previous efforts on voice conversion focus on factorizing speech into explicitly disentangled representations that…

This paper proposes a general enhancement to the Normalizing Flows (NF) used in neural vocoding. As a case study, we improve expressive speech vocoding with a revamped Parallel Wavenet (PW). Specifically, we propose to extend the affine…

Audio and Speech Processing · Electrical Eng. & Systems 2022-02-17 Adam Gabryś, Yunlong Jiao, Viacheslav Klimkov, Daniel Korzekwa, Roberto Barra-Chicote