Related papers: Bolt: Accelerated Data Mining with Fast Vector Com…

MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization

Vector quantization (VQ) is a hardware-friendly DNN compression method that can reduce the storage cost and weight-loading datawidth of hardware accelerators. However, conventional VQ techniques lead to significant accuracy loss because the…

Computer Vision and Pattern Recognition · Computer Science 2024-12-17 Shuaiting Li, Chengxuan Wang, Juncan Deng, Zeyu Wang, Zewen Ye, Zongsheng Wang, Haibin Shen, Kejie Huang

Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search

Approximate nearest neighbor search for vectors relies on indexes that are most often accessed from RAM. Therefore, storage is the factor limiting the size of the database that can be served from a machine. Lossy vector compression, i.e.,…

Machine Learning · Computer Science 2025-01-22 Daniel Severo, Giuseppe Ottaviano, Matthew Muckley, Karen Ullrich, Matthijs Douze

Compressing Deep Convolutional Networks using Vector Quantization

Deep convolutional neural networks (CNNs) have become the most promising method for object recognition, repeatedly demonstrating record breaking results for image classification and object detection in recent years. However, a very deep CNN…

Computer Vision and Pattern Recognition · Computer Science 2014-12-22 Yunchao Gong, Liu Liu, Ming Yang, Lubomir Bourdev

Nowadays, data is represented by vectors. Retrieving those vectors, among millions and billions, that are similar to a given query is a ubiquitous problem, known as similarity search, of relevance for a wide range of applications.…

Machine Learning · Computer Science 2023-07-26 Cecilia Aguerrebere, Ishwar Bhati, Mark Hildebrand, Mariano Tepper, Ted Willke

Ultra-Quantisation: Efficient Embedding Search via 1.58-bit Encodings

Many modern search domains comprise high-dimensional vectors of floating point numbers derived from neural networks, in the form of embeddings. Typical embeddings range in size from hundreds to thousands of dimensions, making the size of…

Machine Learning · Computer Science 2025-06-03 Richard Connor, Alan Dearle, Ben Claydon

VecQ: Minimal Loss DNN Model Compression With Vectorized Weight Quantization

Quantization has been proven to be an effective method for reducing the computing and/or storage cost of DNNs. However, the trade-off between the quantization bitwidth and final accuracy is complex and non-convex, which makes it difficult…

Computer Vision and Pattern Recognition · Computer Science 2020-06-11 Cheng Gong, Yao Chen, Ye Lu, Tao Li, Cong Hao, Deming Chen

Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks

Compressing large neural networks is an important step for their deployment in resource-constrained computational platforms. In this context, vector quantization is an appealing framework that expresses multiple parameters using a single…

Computer Vision and Pattern Recognition · Computer Science 2021-04-13 Julieta Martinez, Jashan Shewakramani, Ting Wei Liu, Ioan Andrei Bârsan, Wenyuan Zeng, Raquel Urtasun

Qinco2: Vector Compression and Search with Improved Implicit Neural Codebooks

Vector quantization is a fundamental technique for compression and large-scale nearest neighbor search. For high-accuracy operating points, multi-codebook quantization associates data vectors with one element from each of multiple…

Machine Learning · Computer Science 2025-01-08 Théophane Vallaeys, Matthew Muckley, Jakob Verbeek, Matthijs Douze

Accelerated Distance Computation with Encoding Tree for High Dimensional Data

We propose a novel distance to calculate distance between high dimensional vector pairs, utilizing vector quantization generated encodings. Vector quantization based methods are successful in handling large scale high dimensional data.…

Computer Vision and Pattern Recognition · Computer Science 2015-09-21 Shicong Liu, Junru Shao, Hongtao Lu

Low-Precision Quantization for Efficient Nearest Neighbor Search

Fast k-Nearest Neighbor search over real-valued vector spaces (KNN) is an important algorithmic task for information retrieval and recommendation systems. We present a method for using reduced precision to represent vectors through…

Information Retrieval · Computer Science 2021-10-19 Anthony Ko, Iman Keivanloo, Vihan Lakshman, Eric Schkufza

Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search

Approximate nearest neighbor (ANN) query in high-dimensional Euclidean space is a key operator in database systems. For this query, quantization is a popular family of methods developed for compressing vectors and reducing memory…

Databases · Computer Science 2024-09-17 Jianyang Gao, Yutong Gou, Yuexuan Xu, Yongyi Yang, Cheng Long, Raymond Chi-Wing Wong

Vector Quantization for Machine Vision

This paper shows how to reduce the computational cost for a variety of common machine vision tasks by operating directly in the compressed domain, particularly in the context of hardware acceleration. Pyramid Vector Quantization (PVQ) is…

Computer Vision and Pattern Recognition · Computer Science 2016-03-31 Vincenzo Liguori

Individualized non-uniform quantization for vector search

Embedding vectors are widely used for representing unstructured data and searching through it for semantically similar items. However, the large size of these vectors, due to their high-dimensionality, creates problems for modern vector…

Machine Learning · Computer Science 2025-09-24 Mariano Tepper, Ted Willke

Accelerate Support Vector Clustering via Spectrum-Preserving Data Compression

This paper proposes a novel framework for accelerating support vector clustering. The proposed method first computes much smaller compressed data sets while preserving the key cluster properties of the original data sets based on a novel…

Machine Learning · Computer Science 2023-05-16 Yuxuan Song, Yongyu Wang

VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference

Quantization enables efficient acceleration of deep neural networks by reducing model memory footprint and exploiting low-cost integer math hardware units. Quantization maps floating-point weights and activations in a trained model to…

Machine Learning · Computer Science 2021-02-11 Steve Dai, Rangharajan Venkatesan, Haoxing Ren, Brian Zimmer, William J. Dally, Brucek Khailany

Pushing Toward the Simplex Vertices: A Simple Remedy for Code Collapse in Smoothed Vector Quantization

Vector quantization, which discretizes a continuous vector space into a finite set of representative vectors (a codebook), has been widely adopted in modern machine learning. Despite its effectiveness, vector quantization poses a…

Machine Learning · Computer Science 2026-01-30 Takashi Morita

Decoding billions of integers per second through vectorization

In many important applications -- such as search engines and relational database systems -- data is stored in the form of arrays of integers. Encoding and, most importantly, decoding of these arrays consumes considerable CPU time.…

Information Retrieval · Computer Science 2021-02-02 Daniel Lemire, Leonid Boytsov

Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey

Vision Transformers (ViTs) have recently garnered considerable attention, emerging as a promising alternative to convolutional neural networks (CNNs) in several vision-related applications. However, their large model sizes and high…

Machine Learning · Computer Science 2024-05-02 Dayou Du, Gu Gong, Xiaowen Chu

Accelerating Competitive Learning Graph Quantization

Vector quantization (VQ) is a lossy data compression technique from signal processing for which simple competitive learning is one standard method to quantize patterns from the input space. Extending competitive learning VQ to the domain of…

Computer Vision and Pattern Recognition · Computer Science 2010-01-07 Brijnesh J. Jain, Klaus Obermayer

Interleaved Composite Quantization for High-Dimensional Similarity Search

Similarity search retrieves the nearest neighbors of a query vector from a dataset of high-dimensional vectors. As the size of the dataset grows, the cost of performing the distance computations needed to implement a query can become…

Machine Learning · Computer Science 2019-12-20 Soroosh Khoram, Stephen J Wright, Jing Li