Related papers: Masked Vector Quantization

MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization

Vector quantization (VQ) is a hardware-friendly DNN compression method that can reduce the storage cost and weight-loading datawidth of hardware accelerators. However, conventional VQ techniques lead to significant accuracy loss because the…

Computer Vision and Pattern Recognition · Computer Science 2024-12-17 Shuaiting Li, Chengxuan Wang, Juncan Deng, Zeyu Wang, Zewen Ye, Zongsheng Wang, Haibin Shen, Kejie Huang

Variational Masked Diffusion Models

Masked diffusion models have recently emerged as a flexible framework for discrete generative modeling. However, a key limitation of standard masked diffusion is its inability to effectively capture dependencies among tokens that are…

Machine Learning · Computer Science 2025-10-28 Yichi Zhang, Alex Schwing, Zhizhen Zhao

VAEVQ: Enhancing Discrete Visual Tokenization through Variational Modeling

Vector quantization (VQ) transforms continuous image features into discrete representations, providing compressed, tokenized inputs for generative models. However, VQ-based frameworks suffer from several issues, such as non-smooth latent…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Sicheng Yang, Xing Hu, Qiang Wu, Dawei Yang

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

Masked Image Modeling (MIM) with Vector Quantization (VQ) has achieved great success in both self-supervised pre-training and image generation. However, most existing methods struggle to address the trade-off in shared latent space for…

Computer Vision and Pattern Recognition · Computer Science 2025-04-02 Siyuan Li, Luyuan Zhang, Zedong Wang, Juanxi Tian, Cheng Tan, Zicheng Liu, Chang Yu, Qingsong Xie, Haonan Lu, Haoqian Wang, Zhen Lei

MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization

Vector Quantized Variational Autoencoders (VQ-VAEs) are fundamental models that compress continuous visual data into discrete tokens. Existing methods have tried to improve the quantization strategy for better reconstruction quality,…

Computer Vision and Pattern Recognition · Computer Science 2025-07-15 Mingkai Jia, Wei Yin, Xiaotao Hu, Jiaxin Guo, Xiaoyang Guo, Qian Zhang, Xiao-Xiao Long, Ping Tan

Learning Low-Rank Representations for Model Compression

Vector Quantization (VQ) is an appealing model compression method to obtain a tiny model with less accuracy loss. While methods to obtain better codebooks and codes under fixed clustering dimensionality have been extensively studied,…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Zezhou Zhu, Yucong Zhou, Zhao Zhong

MGVQ: Synergizing Multi-dimensional Sensitivity-Aware and Gradient-Hessian Fusion for Vector Quantization

Vision-Language Models (VLMs) achieve outstanding performance, yet their huge model size severely hinders deployment on edge devices with limited resources. As an efficient model compression technique, vector quantization (VQ) excels in…

Computer Vision and Pattern Recognition · Computer Science 2026-05-26 Zhong Wang, Zukang Xu, Xing Hu, Dawei Yang

Regularized Vector Quantization for Tokenized Image Synthesis

Quantizing images into discrete representations has been a fundamental problem in unified generative modeling. Predominant approaches learn the discrete representation either in a deterministic manner by selecting the best-matching token or…

Computer Vision and Pattern Recognition · Computer Science 2023-10-17 Jiahui Zhang, Fangneng Zhan, Christian Theobalt, Shijian Lu

MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation

Although two-stage Vector Quantized (VQ) generative models allow for synthesizing high-fidelity and high-resolution images, their quantization operator encodes similar patches within an image into the same index, resulting in a repeated…

Computer Vision and Pattern Recognition · Computer Science 2022-09-20 Chuanxia Zheng, Long Tung Vuong, Jianfei Cai, Dinh Phung

Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation

Knowledge distillation (KD) is a common approach to improve model performance in automatic speech recognition (ASR), where a student model is trained to imitate the output behaviour of a teacher model. However, traditional KD methods suffer…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-02 Liyong Guo, Xiaoyu Yang, Quandong Wang, Yuxiang Kong, Zengwei Yao, Fan Cui, Fangjun Kuang, Wei Kang, Long Lin, Mingshuang Luo, Piotr Zelasko, Daniel Povey

VEQ: Modality-Adaptive Quantization for MoE Vision-Language Models

Mixture-of-Experts (MoE) Vision-Language Models (VLMs) offer remarkable performance but incur prohibitive memory and computational costs, making compression essential. Post-Training Quantization (PTQ) is an effective training-free technique…

Computer Vision and Pattern Recognition · Computer Science 2026-02-03 Guangshuo Qin, Zhiteng Li, Zheng Chen, Weihang Zhang, Linghe Kong, Yulun Zhang

Mitigating Premature Discretization with Progressive Quantization for Robust Vector Tokenization

Vector Quantization (VQ) has become the cornerstone of tokenization for many multimodal Large Language Models and diffusion synthesis. However, existing VQ paradigms suffer from a fundamental conflict: they enforce discretization before the…

Machine Learning · Computer Science 2026-03-25 Wenhao Zhao, Qiran Zou, Zhouhan Lin, Dianbo Liu

Scalable Image Tokenization with Index Backpropagation Quantization

Existing vector quantization (VQ) methods struggle with scalability, largely attributed to the instability of the codebook that undergoes partial updates during training. The codebook is prone to collapse as utilization decreases, due to…

Computer Vision and Pattern Recognition · Computer Science 2025-03-11 Fengyuan Shi, Zhuoyan Luo, Yixiao Ge, Yujiu Yang, Ying Shan, Limin Wang

KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models

Mixture of Experts (MoE) models have achieved great success by significantly improving performance while maintaining computational efficiency through sparse expert activation. However, their enormous parameter sizes and memory demands pose…

Machine Learning · Computer Science 2026-02-25 Zukang Xu, Zhixiong Zhao, Xing Hu, Zhixuan Chen, Dawei Yang

HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes

Vector quantization (VQ) is a technique to deterministically learn features with discrete codebook representations. It is commonly performed with a variational autoencoding model, VQ-VAE, which can be further extended to hierarchical…

Machine Learning · Computer Science 2024-03-29 Yuhta Takida, Yukara Ikemiya, Takashi Shibuya, Kazuki Shimada, Woosung Choi, Chieh-Hsin Lai, Naoki Murata, Toshimitsu Uesaka, Kengo Uchida, Wei-Hsiang Liao, Yuki Mitsufuji

Efficient Generative Modeling with Residual Vector Quantization-Based Tokens

We introduce ResGen, an efficient Residual Vector Quantization (RVQ)-based generative model for high-fidelity generation with fast sampling. RVQ improves data fidelity by increasing the number of quantization steps, referred to as depth,…

Machine Learning · Computer Science 2025-06-03 Jaehyeon Kim, Taehong Moon, Keon Lee, Jaewoong Cho

Masked Frequency Modeling for Self-Supervised Visual Pre-Training

We present Masked Frequency Modeling (MFM), a unified frequency-domain-based approach for self-supervised pre-training of visual models. Instead of randomly inserting mask tokens to the input embeddings in the spatial domain, in this paper,…

Computer Vision and Pattern Recognition · Computer Science 2023-04-26 Jiahao Xie, Wei Li, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy

VQKV: High-Fidelity and High-Ratio Cache Compression via Vector-Quantization

The growing context length of Large Language Models (LLMs) enlarges the Key-Value (KV) cache, limiting deployment in resource-limited environments. Prior training-free approaches for KV cache compression typically rely on low-rank…

Computation and Language · Computer Science 2026-03-18 Yixuan Wang, Qingyu Shi, Jiayu Zhou, Dianbo Liu, Ziwei He, Zhouhan Lin

Addressing Representation Collapse in Vector Quantized Models with One Linear Layer

Vector Quantization (VQ) is essential for discretizing continuous representations in unsupervised learning but suffers from representation collapse, causing low codebook utilization and limiting scalability. Existing solutions often rely on…

Machine Learning · Computer Science 2025-10-06 Yongxin Zhu, Bocheng Li, Yifei Xin, Zhihua Xia, Linli Xu

LG-VQ: Language-Guided Codebook Learning

Vector quantization (VQ) is a key technique in high-resolution and high-fidelity image synthesis, which aims to learn a codebook to encode an image with a sequence of discrete codes and then generate an image in an auto-regression manner.…

Computer Vision and Pattern Recognition · Computer Science 2024-10-10 Guotao Liang, Baoquan Zhang, Yaowei Wang, Xutao Li, Yunming Ye, Huaibin Wang, Chuyao Luo, Kola Ye, Linfeng Luo