Related papers: Geometry-Preserving Encoder/Decoder in Latent Gene…

Generative Model without Prior Distribution Matching

Variational Autoencoder (VAE) and its variations are classic generative models by learning a low-dimensional latent representation to satisfy some prior distribution (e.g., Gaussian distribution). Their advantages over GAN are that they can…

Computer Vision and Pattern Recognition · Computer Science 2020-09-24 Cong Geng, Jia Wang, Li Chen, Zhiyong Gao

Geometric Autoencoder for Diffusion Models

Latent diffusion models have established a new state-of-the-art in high-resolution visual generation. Integrating Vision Foundation Model priors improves generative efficiency, yet existing latent designs remain largely heuristic. These…

Computer Vision and Pattern Recognition · Computer Science 2026-03-13 Hangyu Liu, Jianyong Wang, Yutao Sun

Complexity Matters: Rethinking the Latent Space for Generative Modeling

In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion models the latent space induced by an encoder and generates images through a paired decoder. Although the selection of…

Machine Learning · Computer Science 2023-10-31 Tianyang Hu, Fei Chen, Haonan Wang, Jiawei Li, Wenjia Wang, Jiacheng Sun, Zhenguo Li

Perceptual Generative Autoencoders

Modern generative models are usually designed to match target distributions directly in the data space, where the intrinsic dimension of data can be much lower than the ambient dimension. We argue that this discrepancy may contribute to the…

Machine Learning · Computer Science 2020-07-02 Zijun Zhang, Ruixiang Zhang, Zongpeng Li, Yoshua Bengio, Liam Paull

Variational Diffusion Auto-encoder: Latent Space Extraction from Pre-trained Diffusion Models

As a widely recognized approach to deep generative modeling, Variational Auto-Encoders (VAEs) still face challenges with the quality of generated images, often presenting noticeable blurriness. This issue stems from the unrealistic…

Machine Learning · Computer Science 2023-05-22 Georgios Batzolis, Jan Stanczuk, Carola-Bibiane Schönlieb

One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation

Visual generative models (e.g., diffusion models) typically operate in compressed latent spaces to balance training efficiency and sample quality. In parallel, there has been growing interest in leveraging high-quality pre-trained visual…

Computer Vision and Pattern Recognition · Computer Science 2025-12-17 Yuan Gao, Chen Chen, Tianrong Chen, Jiatao Gu

Geometry-Aware Hamiltonian Variational Auto-Encoder

Variational auto-encoders (VAEs) have proven to be a well suited tool for performing dimensionality reduction by extracting latent variables lying in a potentially much smaller dimensional space than the data. Their ability to capture…

Machine Learning · Statistics 2020-10-23 Clément Chadebec, Clément Mantoux, Stéphanie Allassonnière

CoVAE: Consistency Training of Variational Autoencoders

Current state-of-the-art generative approaches frequently rely on a two-stage training procedure, where an autoencoder (often a VAE) first performs dimensionality reduction, followed by training a generative model on the learned latent…

Machine Learning · Statistics 2025-07-15 Gianluigi Silvestri, Luca Ambrogioni

DGAE: Diffusion-Guided Autoencoder for Efficient Latent Representation Learning

Autoencoders empower state-of-the-art image and video generative models by compressing pixels into a latent space through visual tokenization. Although recent advances have alleviated the performance degradation of autoencoders under high…

Computer Vision and Pattern Recognition · Computer Science 2026-01-14 Dongxu Liu, Jiahui Zhu, Yuang Peng, Haomiao Tang, Yuwei Chen, Chunrui Han, Zheng Ge, Daxin Jiang, Mingxue Liao

Latent Generative Modeling of Random Fields from Limited Training Data

The ability to accurately model random fields plays a critical role in science and engineering for problems involving uncertain, spatially-varying quantities such as heterogeneous material properties and turbulent flows. Deep generative…

Machine Learning · Computer Science 2026-05-04 James E. Warner, Tristan A. Shah, Patrick E. Leser, Geoffrey F. Bomarito, Joshua D. Pribe, Michael C. Stanley

Distribution Matching Variational AutoEncoder

Most visual generative models compress images into a latent space before applying diffusion or autoregressive modelling. Yet, existing approaches such as VAEs and foundation model aligned encoders implicitly constrain the latent space…

Computer Vision and Pattern Recognition · Computer Science 2025-12-09 Sen Ye, Jianning Pei, Mengde Xu, Shuyang Gu, Chunyu Wang, Liwei Wang, Han Hu

Representing 3D Shapes With 64 Latent Vectors for 3D Diffusion Models

Constructing a compressed latent space through a variational autoencoder (VAE) is the key for efficient 3D diffusion models. This paper introduces COD-VAE that encodes 3D shapes into a COmpact set of 1D latent vectors without sacrificing…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 In Cho, Youngbeom Yoo, Subin Jeon, Seon Joo Kim

Coupled Diffusion-Encoder Models for Reconstruction of Flow Fields

Data-driven flow-field reconstruction typically relies on autoencoder architectures that compress high-dimensional states into low-dimensional latent representations. However, classical approaches such as variational autoencoders (VAEs)…

Machine Learning · Computer Science 2026-01-14 AmirPouya Hemmasian, Amir Barati Farimani

Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders

The variational auto-encoder (VAE) is a popular method for learning a generative model and embeddings of the data. Many real datasets are hierarchically structured. However, traditional VAEs map data in a Euclidean latent space which cannot…

Machine Learning · Statistics 2019-11-27 Emile Mathieu, Charline Le Lan, Chris J. Maddison, Ryota Tomioka, Yee Whye Teh

A Geometric Perspective on Variational Autoencoders

This paper introduces a new interpretation of the Variational Autoencoder framework by taking a fully geometric point of view. We argue that vanilla VAE models unveil naturally a Riemannian structure in their latent space and that taking…

Machine Learning · Statistics 2022-11-04 Clément Chadebec, Stéphanie Allassonnière

Beyond Gaussian Bottlenecks: Topologically Aligned Encoding of Vision-Transformer Feature Spaces

Modern visual world modeling systems increasingly rely on high-capacity architectures and large-scale data to produce plausible motion, yet they often fail to preserve underlying 3D geometry or physically consistent camera dynamics. A key…

Computer Vision and Pattern Recognition · Computer Science 2026-05-01 Andrew Bond, Ilkin Umut Melanlioglu, Erkut Erdem, Aykut Erdem

GLSR-VAE: Geodesic Latent Space Regularization for Variational AutoEncoder Architectures

VAEs (Variational AutoEncoders) have proved to be powerful in the context of density modeling and have been used in a variety of contexts for creative purposes. In many settings, the data we model possesses continuous attributes that we…

Machine Learning · Computer Science 2017-07-18 Gaëtan Hadjeres, Frank Nielsen, François Pachet

H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models

Autoencoder (AE) is the key to the success of latent diffusion models for image and video generation, reducing the denoising resolution and improving efficiency. However, the power of AE has long been underexplored in terms of network…

Computer Vision and Pattern Recognition · Computer Science 2025-10-02 Yushu Wu, Yanyu Li, Ivan Skorokhodov, Anil Kag, Willi Menapace, Sharath Girish, Aliaksandr Siarohin, Yanzhi Wang, Sergey Tulyakov

Guided Variational Autoencoder for Disentanglement Learning

We propose an algorithm, guided variational autoencoder (Guided-VAE), that is able to learn a controllable generative model by performing latent representation disentanglement learning. The learning objective is achieved by providing…

Computer Vision and Pattern Recognition · Computer Science 2020-04-06 Zheng Ding, Yifan Xu, Weijian Xu, Gaurav Parmar, Yang Yang, Max Welling, Zhuowen Tu

Hidden Talents of the Variational Autoencoder

Variational autoencoders (VAE) represent a popular, flexible form of deep generative model that can be stochastically fit to samples from a given random process using an information-theoretic variational bound on the true underlying…

Machine Learning · Computer Science 2019-10-08 Bin Dai, Yu Wang, John Aston, Gang Hua, David Wipf