Related papers: Efficient-VDVAE: Less is more

Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images

We present a hierarchical VAE that, for the first time, generates samples quickly while outperforming the PixelCNN in log-likelihood on all natural image benchmarks. We begin by observing that, in theory, VAEs can actually represent…

Machine Learning · Computer Science 2021-03-18 Rewon Child

Optimizing Hierarchical Image VAEs for Sample Quality

While hierarchical variational autoencoders (VAEs) have achieved great density estimation on image modeling tasks, samples from their prior tend to look less convincing than models with similar log-likelihood. We attribute this to learned…

Machine Learning · Computer Science 2022-10-20 Eric Luhman, Troy Luhman

NVAE: A Deep Hierarchical Variational Autoencoder

Normalizing flows, autoregressive models, variational autoencoders (VAEs), and deep energy-based models are among competing likelihood-based frameworks for deep generative learning. Among them, VAEs have the advantage of fast and tractable…

Machine Learning · Statistics 2021-01-11 Arash Vahdat, Jan Kautz

Deep Hierarchical Video Compression

Recently, probabilistic predictive coding that directly models the conditional distribution of latent features across successive frames for temporal redundancy removal has yielded promising results. Existing methods using a single-scale…

Image and Video Processing · Electrical Eng. & Systems 2023-12-13 Ming Lu, Zhihao Duan, Fengqing Zhu, Zhan Ma

Hierarchical Quantized Autoencoders

Despite progress in training neural networks for lossy image compression, current approaches fail to maintain both perceptual quality and abstract features at very low bitrates. Encouraged by recent success in learning discrete…

Machine Learning · Computer Science 2020-10-19 Will Williams, Sam Ringer, Tom Ash, John Hughes, David MacLeod, Jamie Dougherty

High Fidelity Image Synthesis With Deep VAEs In Latent Space

We present fast, realistic image generation on high-resolution, multimodal datasets using hierarchical variational autoencoders (VAEs) trained on a deterministic autoencoder's latent space. In this two-stage setup, the autoencoder…

Computer Vision and Pattern Recognition · Computer Science 2023-03-27 Troy Luhman, Eric Luhman

LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models

Recent advances in Latent Video Diffusion Models (LVDMs) have revolutionized video generation by leveraging Video Variational Autoencoders (Video VAEs) to compress intricate video data into a compact latent space. However, as LVDM training…

Computer Vision and Pattern Recognition · Computer Science 2025-03-19 Yu Cheng, Fajie Yuan

Generalizing Variational Autoencoders with Hierarchical Empirical Bayes

Variational Autoencoders (VAEs) have experienced recent success as data-generating models by using simple architectures that do not require significant fine-tuning of hyperparameters. However, VAEs are known to suffer from…

Machine Learning · Statistics 2020-07-22 Wei Cheng, Gregory Darnell, Sohini Ramachandran, Lorin Crawford

BIVA: A Very Deep Hierarchy of Latent Variables for Generative Modeling

With the introduction of the variational autoencoder (VAE), probabilistic latent variable models have received renewed attention as powerful generative models. However, their performance in terms of test likelihood and quality of generated…

Machine Learning · Statistics 2020-01-13 Lars Maaløe, Marco Fraccaro, Valentin Liévin, Ole Winther

LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models

Advances in latent diffusion models (LDMs) have revolutionized high-resolution image generation, but the design space of the autoencoder that is central to these systems remains underexplored. In this paper, we introduce LiteVAE, a new…

Machine Learning · Computer Science 2025-01-22 Seyedmorteza Sadat, Jakob Buhmann, Derek Bradley, Otmar Hilliges, Romann M. Weber

Discouraging posterior collapse in hierarchical Variational Autoencoders using context

Hierarchical Variational Autoencoders (VAEs) are among the most popular likelihood-based generative models. There is a consensus that the top-down hierarchical VAEs allow effective learning of deep latent structures and avoid problems like…

Machine Learning · Computer Science 2023-09-29 Anna Kuzina, Jakub M. Tomczak

Note: Variational Encoding of Protein Dynamics Benefits from Maximizing Latent Autocorrelation

As deep Variational Auto-Encoder (VAE) frameworks become more widely used for modeling biomolecular simulation data, we emphasize the capability of the VAE architecture to concurrently maximize the timescale of the latent space while…

Chemical Physics · Physics 2021-12-08 Hannah K. Wayment-Steele, Vijay S. Pande

Relaxed-Responsibility Hierarchical Discrete VAEs

Successfully training Variational Autoencoders (VAEs) with a hierarchy of discrete latent variables remains an area of active research. Vector-Quantised VAEs are a powerful approach to discrete VAEs, but naive hierarchical extensions can be…

Machine Learning · Statistics 2021-02-05 Matthew Willetts, Xenia Miscouridou, Stephen Roberts, Chris Holmes

High-Efficiency Neural Video Compression via Hierarchical Predictive Learning

The enhanced Deep Hierarchical Video Compression (DHVC 2.0) has been introduced. This single-model neural video codec operates across a broad range of bitrates, delivering not only superior compression performance to representative methods…

Image and Video Processing · Electrical Eng. & Systems 2024-10-04 Ming Lu, Zhihao Duan, Wuyang Cong, Dandan Ding, Fengqing Zhu, Zhan Ma

SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer

Efficient image tokenization with high compression ratios remains a critical challenge for training generative models. We present SoftVQ-VAE, a continuous image tokenizer that leverages soft categorical posteriors to aggregate multiple…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Hao Chen, Ze Wang, Xiang Li, Ximeng Sun, Fangyi Chen, Jiang Liu, Jindong Wang, Bhiksha Raj, Zicheng Liu, Emad Barsoum

DA-VAE: Plug-in Latent Compression for Diffusion via Detail Alignment

Reducing token count is crucial for efficient training and inference of latent diffusion models, especially at high resolution. A common strategy is to build high-compression image tokenizers with more channels per token. However, when…

Computer Vision and Pattern Recognition · Computer Science 2026-03-24 Xin Cai, Zhiyuan You, Zhoutong Zhang, Tianfan Xue

DiffuseVAE: Efficient, Controllable and High-Fidelity Generation from Low-Dimensional Latents

Diffusion probabilistic models have been shown to generate state-of-the-art results on several competitive image synthesis benchmarks but lack a low-dimensional, interpretable latent space, and are slow at generation. On the other hand,…

Machine Learning · Computer Science 2022-11-30 Kushagra Pandey, Avideep Mukherjee, Piyush Rai, Abhishek Kumar

Hierarchical Vector-Quantized Latents for Perceptual Low-Resolution Video Compression

The exponential growth of video traffic has placed increasing demands on bandwidth and storage infrastructure, particularly for content delivery networks (CDNs) and edge devices. While traditional video codecs like H.264 and HEVC achieve…

Computer Vision and Pattern Recognition · Computer Science 2026-01-01 Manikanta Kotthapalli, Banafsheh Rekabdar

Split Hierarchical Variational Compression

Variational autoencoders (VAEs) have witnessed great success in performing the compression of image datasets. This success, made possible by the bits-back coding framework, has produced competitive compression performance across many…

Image and Video Processing · Electrical Eng. & Systems 2022-04-06 Tom Ryder, Chen Zhang, Ning Kang, Shifeng Zhang

Lossy Image Compression with Quantized Hierarchical VAEs

Recent research has shown a strong theoretical connection between variational autoencoders (VAEs) and the rate-distortion theory. Motivated by this, we consider the problem of lossy image compression from the perspective of generative…

Image and Video Processing · Electrical Eng. & Systems 2023-03-28 Zhihao Duan, Ming Lu, Zhan Ma, Fengqing Zhu