Related papers: LL-VQ-VAE: Learnable Lattice Vector-Quantization F…

HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes

Vector quantization (VQ) is a technique to deterministically learn features with discrete codebook representations. It is commonly performed with a variational autoencoding model, VQ-VAE, which can be further extended to hierarchical…

Machine Learning · Computer Science 2024-03-29 Yuhta Takida , Yukara Ikemiya , Takashi Shibuya , Kazuki Shimada , Woosung Choi , Chieh-Hsin Lai , Naoki Murata , Toshimitsu Uesaka , Kengo Uchida , Wei-Hsiang Liao , Yuki Mitsufuji

Robust Training of Vector Quantized Bottleneck Models

In this paper we demonstrate methods for reliable and efficient training of discrete representation using Vector-Quantized Variational Auto-Encoder models (VQ-VAEs). Discrete latent variable models have been shown to learn nontrivial…

Machine Learning · Computer Science 2024-09-13 Adrian Łańcucki , Jan Chorowski , Guillaume Sanchez , Ricard Marxer , Nanxin Chen , Hans J. G. A. Dolfing , Sameer Khurana , Tanel Alumäe , Antoine Laurent

Neural Discrete Representation Learning

Learning useful representations without supervision remains a key challenge in machine learning. In this paper, we propose a simple yet powerful generative model that learns such discrete representations. Our model, the Vector…

Machine Learning · Computer Science 2018-05-31 Aaron van den Oord , Oriol Vinyals , Koray Kavukcuoglu

Learning Product Codebooks using Vector Quantized Autoencoders for Image Retrieval

Vector-Quantized Variational Autoencoders (VQ-VAE)[1] provide an unsupervised model for learning discrete representations by combining vector quantization and autoencoders. In this paper, we study the use of VQ-VAE for representation…

Image and Video Processing · Electrical Eng. & Systems 2019-03-05 Hanwei Wu , Markus Flierl

SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization

One noted issue of vector-quantized variational autoencoder (VQ-VAE) is that the learned discrete representation uses only a fraction of the full capacity of the codebook, also known as codebook collapse. We hypothesize that the training…

Machine Learning · Computer Science 2022-06-10 Yuhta Takida , Takashi Shibuya , WeiHsiang Liao , Chieh-Hsin Lai , Junki Ohmura , Toshimitsu Uesaka , Naoki Murata , Shusuke Takahashi , Toshiyuki Kumakura , Yuki Mitsufuji

VP-VAE: Rethinking Vector Quantization via Adaptive Vector Perturbation

Vector Quantized Variational Autoencoders (VQ-VAEs) are fundamental to modern generative modeling, yet they often suffer from training instability and "codebook collapse" due to the inherent coupling of representation learning and discrete…

Machine Learning · Computer Science 2026-02-20 Linwei Zhai , Han Ding , Mingzhi Lin , Cui Zhao , Fei Wang , Ge Wang , Wang Zhi , Wei Xi

VAEVQ: Enhancing Discrete Visual Tokenization through Variational Modeling

Vector quantization (VQ) transforms continuous image features into discrete representations, providing compressed, tokenized inputs for generative models. However, VQ-based frameworks suffer from several issues, such as non-smooth latent…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Sicheng Yang , Xing Hu , Qiang Wu , Dawei Yang

L-VAE: Variational Auto-Encoder with Learnable Beta for Disentangled Representation

In this paper, we propose a novel model called Learnable VAE (L-VAE), which learns a disentangled representation together with the hyperparameters of the cost function. L-VAE can be considered as an extension of β-VAE, wherein the…

Machine Learning · Computer Science 2025-07-04 Hazal Mogultay Ozcan , Sinan Kalkan , Fatos T. Yarman-Vural

Learning Optimal Lattice Vector Quantizers for End-to-end Neural Image Compression

It is customary to deploy uniform scalar quantization in the end-to-end optimized Neural image compression methods, instead of more powerful vector quantization, due to the high complexity of the latter. Lattice vector quantization (LVQ),…

Image and Video Processing · Electrical Eng. & Systems 2024-11-26 Xi Zhang , Xiaolin Wu

Finite Scalar Quantization: VQ-VAE Made Simple

We propose to replace vector quantization (VQ) in the latent representation of VQ-VAEs with a simple scheme termed finite scalar quantization (FSQ), where we project the VAE representation down to a few dimensions (typically less than 10).…

Computer Vision and Pattern Recognition · Computer Science 2023-10-13 Fabian Mentzer , David Minnen , Eirikur Agustsson , Michael Tschannen

Addressing Representation Collapse in Vector Quantized Models with One Linear Layer

Vector Quantization (VQ) is essential for discretizing continuous representations in unsupervised learning but suffers from representation collapse, causing low codebook utilization and limiting scalability. Existing solutions often rely on…

Machine Learning · Computer Science 2025-10-06 Yongxin Zhu , Bocheng Li , Yifei Xin , Zhihua Xia , Linli Xu

Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation

We propose a multi-layer variational autoencoder method, we call HR-VQVAE, that learns hierarchical discrete representations of the data. By utilizing a novel objective function, each layer in HR-VQVAE learns a discrete representation of…

Computer Vision and Pattern Recognition · Computer Science 2022-08-10 Mohammad Adiban , Kalin Stefanov , Sabato Marco Siniscalchi , Giampiero Salvi

Vector Quantized Wasserstein Auto-Encoder

Learning deep discrete latent presentations offers a promise of better symbolic and summarized abstractions that are more useful to subsequent downstream tasks. Inspired by the seminal Vector Quantized Variational Auto-Encoder (VQ-VAE),…

Machine Learning · Computer Science 2023-06-21 Tung-Long Vuong , Trung Le , He Zhao , Chuanxia Zheng , Mehrtash Harandi , Jianfei Cai , Dinh Phung

Theory and Experiments on Vector Quantized Autoencoders

Deep neural networks with discrete latent variables offer the promise of better symbolic reasoning, and learning abstractions that are more useful to new tasks. There has been a surge in interest in discrete latent variable models, however,…

Machine Learning · Computer Science 2018-07-23 Aurko Roy , Ashish Vaswani , Arvind Neelakantan , Niki Parmar

Hierarchical Vector-Quantized Latents for Perceptual Low-Resolution Video Compression

The exponential growth of video traffic has placed increasing demands on bandwidth and storage infrastructure, particularly for content delivery networks (CDNs) and edge devices. While traditional video codecs like H.264 and HEVC achieve…

Computer Vision and Pattern Recognition · Computer Science 2026-01-01 Manikanta Kotthapalli , Banafsheh Rekabdar

Lattice Representation Learning

In this article we introduce theory and algorithms for learning discrete representations that take on a lattice that is embedded in an Euclidean space. Lattice representations possess an interesting combination of properties: a) they can be…

Machine Learning · Computer Science 2020-06-25 Luis A. Lastras

LASERS: LAtent Space Encoding for Representations with Sparsity for Generative Modeling

Learning compact and meaningful latent space representations has been shown to be very useful in generative modeling tasks for visual data. One particular example is applying Vector Quantization (VQ) in variational autoencoders (VQ-VAEs,…

Machine Learning · Computer Science 2024-09-18 Xin Li , Anand Sarwate

Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization

Vector Quantization (VQ) is a method for discretizing latent representations and has become a major part of the deep learning toolkit. It has been theoretically and empirically shown that discretization of representations leads to improved…

Machine Learning · Computer Science 2022-02-04 Dianbo Liu , Alex Lamb , Xu Ji , Pascal Notsawo , Mike Mozer , Yoshua Bengio , Kenji Kawaguchi

Learning source-aware representations of music in a discrete latent space

In recent years, neural network based methods have been proposed as a method that can generate representations from music, but they are not human readable and hardly analyzable or editable by a human. To address this issue, we propose a novel…

Audio and Speech Processing · Electrical Eng. & Systems 2021-11-29 Jinsung Kim , Yeong-Seok Jeong , Woosung Choi , Jaehwa Chung , Soonyoung Jung

PCA-VAE: Differentiable Subspace Quantization without Codebook Collapse

Vector-quantized autoencoders deliver high-fidelity latents but suffer inherent flaws: the quantizer is non-differentiable, requires straight-through hacks, and is prone to collapse. We address these issues at the root by replacing VQ with…

Machine Learning · Computer Science 2026-02-24 Hao Lu , Onur C. Koyun , Yongxin Guo , Zhengjie Zhu , Abbas Alili , Metin Nafi Gurcan