Related papers: LG-VQ: Language-Guided Codebook Learning

Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text

Image quantization is a crucial technique in image generation, aimed at learning a codebook that encodes an image into a discrete token sequence. Recent advancements have seen researchers exploring learning multi-modal codebook (i.e.,…

Computer Vision and Pattern Recognition · Computer Science 2025-03-12 Guotao Liang, Baoquan Zhang, Zhiyuan Wen, Junteng Zhao, Yunming Ye, Kola Ye, Yao He

VAEVQ: Enhancing Discrete Visual Tokenization through Variational Modeling

Vector quantization (VQ) transforms continuous image features into discrete representations, providing compressed, tokenized inputs for generative models. However, VQ-based frameworks suffer from several issues, such as non-smooth latent…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Sicheng Yang, Xing Hu, Qiang Wu, Dawei Yang

Dual Codebook VQ: Enhanced Image Reconstruction with Reduced Codebook Size

Vector Quantization (VQ) techniques face significant challenges in codebook utilization, limiting reconstruction fidelity in image modeling. We introduce a Dual Codebook mechanism that effectively addresses this limitation by partitioning…

Computer Vision and Pattern Recognition · Computer Science 2025-03-17 Parisa Boodaghi Malidarreh, Jillur Rahman Saurav, Thuong Le Hoai Pham, Amir Hajighasemi, Anahita Samadi, Saurabh Shrinivas Maydeo, Mohammad Sadegh Nasr, Jacob M. Luber

LooC: Effective Low-Dimensional Codebook for Compositional Vector Quantization

Vector quantization (VQ) is a prevalent and fundamental technique that discretizes continuous feature vectors by approximating them using a codebook. As the diversity and complexity of data and models continue to increase, there is an…

Computer Vision and Pattern Recognition · Computer Science 2026-01-05 Jie Li, Kwan-Yee K. Wong, Kai Han

Scalable Training for Vector-Quantized Networks with 100% Codebook Utilization

Vector quantization (VQ) is a key component in discrete tokenizers for image generation, but its training is often unstable due to straight-through estimation bias, one-step-behind updates, and sparse codebook gradients, which lead to…

Computer Vision and Pattern Recognition · Computer Science 2025-09-15 Yifan Chang, Jie Qin, Limeng Qiao, Xiaofeng Wang, Zheng Zhu, Lin Ma, Xingang Wang

Soft Convex Quantization: Revisiting Vector Quantization with Convex Optimization

Vector Quantization (VQ) is a well-known technique in deep learning for extracting informative discrete latent representations. VQ-embedded models have shown impressive results in a range of applications including image and speech…

Machine Learning · Computer Science 2023-10-05 Tanmay Gautam, Reid Pryzant, Ziyi Yang, Chenguang Zhu, Somayeh Sojoudi

MGVQ: Synergizing Multi-dimensional Sensitivity-Aware and Gradient-Hessian Fusion for Vector Quantization

Vision-Language Models (VLMs) achieve outstanding performance, yet their huge model size severely hinders deployment on edge devices with limited resources. As an efficient model compression technique, vector quantization (VQ) excels in…

Computer Vision and Pattern Recognition · Computer Science 2026-05-26 Zhong Wang, Zukang Xu, Xing Hu, Dawei Yang

Scalable Image Tokenization with Index Backpropagation Quantization

Existing vector quantization (VQ) methods struggle with scalability, largely attributed to the instability of the codebook that undergoes partial updates during training. The codebook is prone to collapse as utilization decreases, due to…

Computer Vision and Pattern Recognition · Computer Science 2025-03-11 Fengyuan Shi, Zhuoyan Luo, Yixiao Ge, Yujiu Yang, Ying Shan, Limin Wang

PCDVQ: Enhancing Vector Quantization for Large Language Models via Polar Coordinate Decoupling

Large Language Models (LLMs) face significant challenges in edge deployment due to their massive parameter scale. Vector Quantization (VQ), a clustering-based quantization method, serves as a prevalent solution to this issue for its…

Machine Learning · Computer Science 2025-06-27 Yuxuan Yue, Zukang Xu, Zhihang Yuan, Dawei Yang, Jianlong Wu, Liqiang Nie

Learning Optimal Lattice Vector Quantizers for End-to-end Neural Image Compression

It is customary to deploy uniform scalar quantization in end-to-end optimized neural image compression methods, instead of more powerful vector quantization, due to the high complexity of the latter. Lattice vector quantization (LVQ),…

Image and Video Processing · Electrical Eng. & Systems 2024-11-26 Xi Zhang, Xiaolin Wu

MOC-RVQ: Multilevel Codebook-Assisted Digital Generative Semantic Communication

Vector quantization-based image semantic communication systems have successfully boosted transmission efficiency, but face challenges with conflicting requirements between codebook design and digital constellation modulation. Traditional…

Computer Vision and Pattern Recognition · Computer Science 2024-10-01 Yingbin Zhou, Yaping Sun, Guanying Chen, Xiaodong Xu, Hao Chen, Binhong Huang, Shuguang Cui, Ping Zhang

SGC-VQGAN: Towards Complex Scene Representation via Semantic Guided Clustering Codebook

Vector quantization (VQ) is a method for deterministically learning features through discrete codebook representations. Recent works have utilized visual tokenizers to discretize visual regions for self-supervised representation learning.…

Computer Vision and Pattern Recognition · Computer Science 2024-09-11 Chenjing Ding, Chiyu Wang, Boshi Liu, Xi Guo, Weixuan Tang, Wei Wu

VEQ: Modality-Adaptive Quantization for MoE Vision-Language Models

Mixture-of-Experts (MoE) Vision-Language Models (VLMs) offer remarkable performance but incur prohibitive memory and computational costs, making compression essential. Post-Training Quantization (PTQ) is an effective training-free technique…

Computer Vision and Pattern Recognition · Computer Science 2026-02-03 Guangshuo Qin, Zhiteng Li, Zheng Chen, Weihang Zhang, Linghe Kong, Yulun Zhang

Codebook Transfer with Part-of-Speech for Vector-Quantized Image Modeling

Vector-Quantized Image Modeling (VQIM) is a fundamental research problem in image synthesis, which aims to represent an image with a discrete token sequence. Existing studies effectively address this problem by learning a discrete codebook…

Computer Vision and Pattern Recognition · Computer Science 2024-03-18 Baoquan Zhang, Huaibin Wang, Luo Chuyao, Xutao Li, Liang Guotao, Yunming Ye, Xiaochen Qi, Yao He

Learning Low-Rank Representations for Model Compression

Vector Quantization (VQ) is an appealing model compression method to obtain a tiny model with less accuracy loss. While methods to obtain better codebooks and codes under fixed clustering dimensionality have been extensively studied,…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Zezhou Zhu, Yucong Zhou, Zhao Zhong

LGQ: Learning Discretization Geometry for Scalable and Stable Image Tokenization

Discrete image tokenization is a key bottleneck for scalable visual generation: a tokenizer must remain compact for efficient latent-space priors while preserving semantic structure and using discrete capacity effectively. Existing…

Computer Vision and Pattern Recognition · Computer Science 2026-02-23 Idil Bilge Altun, Mert Onur Cakiroglu, Elham Buxton, Mehmet Dalkilic, Hasan Kurban

Improved Residual Vector Quantization for High-dimensional Approximate Nearest Neighbor Search

Quantization methods have been introduced to perform large-scale approximate nearest neighbor search tasks. Residual Vector Quantization (RVQ) is one of the effective quantization methods. RVQ uses a multi-stage codebook learning scheme to lower the…

Computer Vision and Pattern Recognition · Computer Science 2015-09-18 Shicong Liu, Hongtao Lu, Junru Shao

Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization

Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm that first learns a codebook to encode images as discrete codes, and then completes generation based on the learned codebook. However, they…

Computer Vision and Pattern Recognition · Computer Science 2023-05-22 Mengqi Huang, Zhendong Mao, Zhuowei Chen, Yongdong Zhang

HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes

Vector quantization (VQ) is a technique to deterministically learn features with discrete codebook representations. It is commonly performed with a variational autoencoding model, VQ-VAE, which can be further extended to hierarchical…

Machine Learning · Computer Science 2024-03-29 Yuhta Takida, Yukara Ikemiya, Takashi Shibuya, Kazuki Shimada, Woosung Choi, Chieh-Hsin Lai, Naoki Murata, Toshimitsu Uesaka, Kengo Uchida, Wei-Hsiang Liao, Yuki Mitsufuji

Improving Test-Time Performance of RVQ-based Neural Codecs

The residual vector quantization (RVQ) technique plays a central role in recent advances in neural audio codecs. These models effectively synthesize high-fidelity audio from a limited number of codes due to the hierarchical structure among…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-24 Hyeongju Kim, Junhyeok Lee, Jacob Morton, Juheon Lee, Jinhyeok Yang