Related papers: Vector Quantization for Machine Vision

Pyramid Vector Quantization for Deep Learning

This paper explores the use of Pyramid Vector Quantization (PVQ) to reduce the computational cost for a variety of neural networks (NNs) while, at the same time, compressing the weights that describe them. This is based on the fact that the…

Machine Learning · Computer Science 2017-04-11 Vincenzo Liguori

MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization

Vector quantization (VQ) is a hardware-friendly DNN compression method that can reduce the storage cost and weight-loading datawidth of hardware accelerators. However, conventional VQ techniques lead to significant accuracy loss because the…

Computer Vision and Pattern Recognition · Computer Science 2024-12-17 Shuaiting Li, Chengxuan Wang, Juncan Deng, Zeyu Wang, Zewen Ye, Zongsheng Wang, Haibin Shen, Kejie Huang

Improving Pyramid Vector Quantizer with power projection

Pyramid Vector Quantizer (PVQ) is a promising technique especially for multimedia data compression, already used in Opus audio codec and considered for AV1 video codec. It quantizes vectors from Euclidean unit sphere by first projecting…

Optimization and Control · Mathematics 2017-05-16 Jarek Duda

Individualized non-uniform quantization for vector search

Embedding vectors are widely used for representing unstructured data and searching through it for semantically similar items. However, the large size of these vectors, due to their high-dimensionality, creates problems for modern vector…

Machine Learning · Computer Science 2025-09-24 Mariano Tepper, Ted Willke

Soft Convex Quantization: Revisiting Vector Quantization with Convex Optimization

Vector Quantization (VQ) is a well-known technique in deep learning for extracting informative discrete latent representations. VQ-embedded models have shown impressive results in a range of applications including image and speech…

Machine Learning · Computer Science 2023-10-05 Tanmay Gautam, Reid Pryzant, Ziyi Yang, Chenguang Zhu, Somayeh Sojoudi

Pyramid Vector Quantization and Bit Level Sparsity in Weights for Efficient Neural Networks Inference

This paper discusses three basic blocks for the inference of convolutional neural networks (CNNs). Pyramid Vector Quantization (PVQ) is discussed as an effective quantizer for CNNs weights resulting in highly sparse and compressible…

Computer Vision and Pattern Recognition · Computer Science 2019-11-26 Vincenzo Liguori

A Hybrid Quantum Encoding Algorithm of Vector Quantization for Image Compression

Many classical encoding algorithms of Vector Quantization (VQ) of image compression that can obtain global optimal solution have computational complexity O(N). A pure quantum VQ encoding algorithm with probability of success near 100% has…

Multimedia · Computer Science 2009-11-11 Chao-Yang Pang, Zheng-Wei Zhou, Guang-Can Guo

Learning Low-Rank Representations for Model Compression

Vector Quantization (VQ) is an appealing model compression method to obtain a tiny model with less accuracy loss. While methods to obtain better codebooks and codes under fixed clustering dimensionality have been extensively studied,…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Zezhou Zhu, Yucong Zhou, Zhao Zhong

Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks

Compressing large neural networks is an important step for their deployment in resource-constrained computational platforms. In this context, vector quantization is an appealing framework that expresses multiple parameters using a single…

Computer Vision and Pattern Recognition · Computer Science 2021-04-13 Julieta Martinez, Jashan Shewakramani, Ting Wei Liu, Ioan Andrei Bârsan, Wenyuan Zeng, Raquel Urtasun

SGC-VQGAN: Towards Complex Scene Representation via Semantic Guided Clustering Codebook

Vector quantization (VQ) is a method for deterministically learning features through discrete codebook representations. Recent works have utilized visual tokenizers to discretize visual regions for self-supervised representation learning.…

Computer Vision and Pattern Recognition · Computer Science 2024-09-11 Chenjing Ding, Chiyu Wang, Boshi Liu, Xi Guo, Weixuan Tang, Wei Wu

MGVQ: Synergizing Multi-dimensional Sensitivity-Aware and Gradient-Hessian Fusion for Vector Quantization

Vision-Language Models (VLMs) achieve outstanding performance, yet their huge model size severely hinders deployment on edge devices with limited resources. As an efficient model compression technique, vector quantization (VQ) excels in…

Computer Vision and Pattern Recognition · Computer Science 2026-05-26 Zhong Wang, Zukang Xu, Xing Hu, Dawei Yang

Bolt: Accelerated Data Mining with Fast Vector Compression

Vectors of data are at the heart of machine learning and data mining. Recently, vector quantization methods have shown great promise in reducing both the time and space costs of operating on vectors. We introduce a vector quantization…

Performance · Computer Science 2017-07-03 Davis W Blalock, John V Guttag

Accelerating Competitive Learning Graph Quantization

Vector quantization (VQ) is a lossy data compression technique from signal processing for which simple competitive learning is one standard method to quantize patterns from the input space. Extending competitive learning VQ to the domain of…

Computer Vision and Pattern Recognition · Computer Science 2010-01-07 Brijnesh J. Jain, Klaus Obermayer

SSVQ: Unleashing the Potential of Vector Quantization with Sign-Splitting

Vector Quantization (VQ) has emerged as a prominent weight compression technique, showcasing substantially lower quantization errors than uniform quantization across diverse models, particularly in extreme compression scenarios. However,…

Computer Vision and Pattern Recognition · Computer Science 2025-08-05 Shuaiting Li, Juncan Deng, Chenxuan Wang, Kedong Xu, Rongtao Deng, Hong Gu, Haibin Shen, Kejie Huang

Graph Quantization

Vector quantization (VQ) is a lossy data compression technique from signal processing, which is restricted to feature vectors and therefore inapplicable for combinatorial structures. This contribution presents a theoretical foundation of…

Artificial Intelligence · Computer Science 2010-01-07 Brijnesh J. Jain, Klaus Obermayer

Optimal and Near-Optimal Adaptive Vector Quantization

Quantization is a fundamental optimization for many machine-learning use cases, including compressing gradients, model weights and activations, and datasets. The most accurate form of quantization is adaptive, where the error is…

Machine Learning · Computer Science 2025-08-01 Ran Ben-Basat, Yaniv Ben-Itzhak, Michael Mitzenmacher, Shay Vargaftik

Efficient VQ-QAT and Mixed Vector/Linear quantized Neural Networks

In this work, we developed and tested 3 techniques for vector quantization (VQ) based model weight compression. To mitigate codebook collapse and enable end-to-end training, we adopted cosine similarity-based assignment. Building on ideas…

Machine Learning · Computer Science 2026-04-28 Terry Gou, Puneet Gupta

VecQ: Minimal Loss DNN Model Compression With Vectorized Weight Quantization

Quantization has been proven to be an effective method for reducing the computing and/or storage cost of DNNs. However, the trade-off between the quantization bitwidth and final accuracy is complex and non-convex, which makes it difficult…

Computer Vision and Pattern Recognition · Computer Science 2020-06-11 Cheng Gong, Yao Chen, Ye Lu, Tao Li, Cong Hao, Deming Chen

Pyramid Vector Quantization for LLMs

Recent works on compression of large language models (LLM) using quantization considered reparameterizing the architecture such that weights are distributed on the sphere. This demonstratively improves the ability to quantize by increasing…

Machine Learning · Computer Science 2024-12-05 Tycho F. A. van der Ouderaa, Maximilian L. Croci, Agrin Hilmkil, James Hensman

Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey

Vision Transformers (ViTs) have recently garnered considerable attention, emerging as a promising alternative to convolutional neural networks (CNNs) in several vision-related applications. However, their large model sizes and high…

Machine Learning · Computer Science 2024-05-02 Dayou Du, Gu Gong, Xiaowen Chu