Related papers: Group-Wise Optimization for Self-Extensible Codebo…

MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization

Vector Quantized Variational Autoencoders (VQ-VAEs) are fundamental models that compress continuous visual data into discrete tokens. Existing methods have tried to improve the quantization strategy for better reconstruction quality,…

Computer Vision and Pattern Recognition · Computer Science 2025-07-15 Mingkai Jia , Wei Yin , Xiaotao Hu , Jiaxin Guo , Xiaoyang Guo , Qian Zhang , Xiao-Xiao Long , Ping Tan

Online Clustered Codebook

Vector Quantisation (VQ) is experiencing a comeback in machine learning, where it is increasingly used in representation learning. However, optimizing the codevectors in existing VQ-VAE is not entirely trivial. A problem is codebook…

Computer Vision and Pattern Recognition · Computer Science 2023-07-31 Chuanxia Zheng , Andrea Vedaldi

Learning Product Codebooks using Vector Quantized Autoencoders for Image Retrieval

Vector-Quantized Variational Autoencoders (VQ-VAE)[1] provide an unsupervised model for learning discrete representations by combining vector quantization and autoencoders. In this paper, we study the use of VQ-VAE for representation…

Image and Video Processing · Electrical Eng. & Systems 2019-03-05 Hanwei Wu , Markus Flierl

Restructuring Vector Quantization with the Rotation Trick

Vector Quantized Variational AutoEncoders (VQ-VAEs) are designed to compress a continuous input to a discrete latent space and reconstruct it with minimal distortion. They operate by maintaining a set of vectors -- often referred to as the…

Machine Learning · Computer Science 2025-03-18 Christopher Fifty , Ronald G. Junkins , Dennis Duan , Aniketh Iyengar , Jerry W. Liu , Ehsan Amid , Sebastian Thrun , Christopher Ré

Dual Codebook VQ: Enhanced Image Reconstruction with Reduced Codebook Size

Vector Quantization (VQ) techniques face significant challenges in codebook utilization, limiting reconstruction fidelity in image modeling. We introduce a Dual Codebook mechanism that effectively addresses this limitation by partitioning…

Computer Vision and Pattern Recognition · Computer Science 2025-03-17 Parisa Boodaghi Malidarreh , Jillur Rahman Saurav , Thuong Le Hoai Pham , Amir Hajighasemi , Anahita Samadi , Saurabh Shrinivas Maydeo , Mohammad Sadegh Nasr , Jacob M. Luber

SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization

One noted issue of vector-quantized variational autoencoder (VQ-VAE) is that the learned discrete representation uses only a fraction of the full capacity of the codebook, also known as codebook collapse. We hypothesize that the training…

Machine Learning · Computer Science 2022-06-10 Yuhta Takida , Takashi Shibuya , WeiHsiang Liao , Chieh-Hsin Lai , Junki Ohmura , Toshimitsu Uesaka , Naoki Murata , Shusuke Takahashi , Toshiyuki Kumakura , Yuki Mitsufuji

Is Hierarchical Quantization Essential for Optimal Reconstruction?

Vector-quantized variational autoencoders (VQ-VAEs) are central to models that rely on high reconstruction fidelity, from neural compression to generative pipelines. Hierarchical extensions, such as VQ-VAE2, are often credited with superior…

Computer Vision and Pattern Recognition · Computer Science 2026-03-20 Shirin Reyhanian , Laurenz Wiskott

HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes

Vector quantization (VQ) is a technique to deterministically learn features with discrete codebook representations. It is commonly performed with a variational autoencoding model, VQ-VAE, which can be further extended to hierarchical…

Machine Learning · Computer Science 2024-03-29 Yuhta Takida , Yukara Ikemiya , Takashi Shibuya , Kazuki Shimada , Woosung Choi , Chieh-Hsin Lai , Naoki Murata , Toshimitsu Uesaka , Kengo Uchida , Wei-Hsiang Liao , Yuki Mitsufuji

Robust Vector Quantized-Variational Autoencoder

Image generative models can learn the distributions of the training data and consequently generate examples by sampling from these distributions. However, when the training dataset is corrupted with outliers, generative models will likely…

Machine Learning · Computer Science 2022-09-21 Chieh-Hsin Lai , Dongmian Zou , Gilad Lerman

LG-VQ: Language-Guided Codebook Learning

Vector quantization (VQ) is a key technique in high-resolution and high-fidelity image synthesis, which aims to learn a codebook to encode an image with a sequence of discrete codes and then generate an image in an auto-regression manner.…

Computer Vision and Pattern Recognition · Computer Science 2024-10-10 Guotao Liang , Baoquan Zhang , Yaowei Wang , Xutao Li , Yunming Ye , Huaibin Wang , Chuyao Luo , Kola Ye , linfeng Luo

Robust Training of Vector Quantized Bottleneck Models

In this paper we demonstrate methods for reliable and efficient training of discrete representation using Vector-Quantized Variational Auto-Encoder models (VQ-VAEs). Discrete latent variable models have been shown to learn nontrivial…

Machine Learning · Computer Science 2024-09-13 Adrian Łańcucki , Jan Chorowski , Guillaume Sanchez , Ricard Marxer , Nanxin Chen , Hans J. G. A. Dolfing , Sameer Khurana , Tanel Alumäe , Antoine Laurent

VP-VAE: Rethinking Vector Quantization via Adaptive Vector Perturbation

Vector Quantized Variational Autoencoders (VQ-VAEs) are fundamental to modern generative modeling, yet they often suffer from training instability and "codebook collapse" due to the inherent coupling of representation learning and discrete…

Machine Learning · Computer Science 2026-02-20 Linwei Zhai , Han Ding , Mingzhi Lin , Cui Zhao , Fei Wang , Ge Wang , Wang Zhi , Wei Xi

Gaussian Mixture Vector Quantization with Aggregated Categorical Posterior

The vector quantization is a widely used method to map continuous representation to discrete space and has important application in tokenization for generative mode, bottlenecking information and many other tasks in machine learning. Vector…

Machine Learning · Computer Science 2024-10-15 Mingyuan Yan , Jiawei Wu , Rushi Shah , Dianbo Liu

Training-Free Vector Quantization via Gaussian VAEs

Vector-quantized variational autoencoders (VQ-VAEs) are discrete autoencoders that compress images into discrete tokens. However, they are difficult to train due to discretization. In this paper, we propose a simple yet effective technique…

Machine Learning · Computer Science 2026-05-27 Tongda Xu , Wendi Zheng , Jiajun He , Jose Miguel Hernandez-Lobato , Yan Wang , Ya-Qin Zhang , Jie Tang

Scalable Image Tokenization with Index Backpropagation Quantization

Existing vector quantization (VQ) methods struggle with scalability, largely attributed to the instability of the codebook that undergoes partial updates during training. The codebook is prone to collapse as utilization decreases, due to…

Computer Vision and Pattern Recognition · Computer Science 2025-03-11 Fengyuan Shi , Zhuoyan Luo , Yixiao Ge , Yujiu Yang , Ying Shan , Limin Wang

Hierarchical Vector Quantized Graph Autoencoder with Annealing-Based Code Selection

Graph self-supervised learning has gained significant attention recently. However, many existing approaches heavily depend on perturbations, and inappropriate perturbations may corrupt the graph's inherent information. The Vector Quantized…

Machine Learning · Computer Science 2025-04-18 Long Zeng , Jianxiang Yu , Jiapeng Zhu , Qingsong Zhong , Xiang Li

ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs

Current neural audio codecs typically use residual vector quantization (RVQ) to discretize speech signals. However, they often experience codebook collapse, which reduces the effective codebook size and leads to suboptimal performance. To…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-12 Rui-Chen Zheng , Hui-Peng Du , Xiao-Hang Jiang , Yang Ai , Zhen-Hua Ling

Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization

Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm that first learns a codebook to encode images as discrete codes, and then completes generation based on the learned codebook. However, they…

Computer Vision and Pattern Recognition · Computer Science 2023-05-22 Mengqi Huang , Zhendong Mao , Zhuowei Chen , Yongdong Zhang

VAEVQ: Enhancing Discrete Visual Tokenization through Variational Modeling

Vector quantization (VQ) transforms continuous image features into discrete representations, providing compressed, tokenized inputs for generative models. However, VQ-based frameworks suffer from several issues, such as non-smooth latent…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Sicheng Yang , Xing Hu , Qiang Wu , Dawei Yang

Soft Convex Quantization: Revisiting Vector Quantization with Convex Optimization

Vector Quantization (VQ) is a well-known technique in deep learning for extracting informative discrete latent representations. VQ-embedded models have shown impressive results in a range of applications including image and speech…

Machine Learning · Computer Science 2023-10-05 Tanmay Gautam , Reid Pryzant , Ziyi Yang , Chenguang Zhu , Somayeh Sojoudi