English
Related papers

Related papers: Pyramid Vector Quantization for Deep Learning

200 papers

This paper shows how to reduce the computational cost for a variety of common machine vision tasks by operating directly in the compressed domain, particularly in the context of hardware acceleration. Pyramid Vector Quantization (PVQ) is…

Computer Vision and Pattern Recognition · Computer Science 2016-03-31 Vincenzo Liguori

This paper discusses three basic blocks for the inference of convolutional neural networks (CNNs). Pyramid Vector Quantization (PVQ) is discussed as an effective quantizer for CNNs weights resulting in highly sparse and compressible…

Computer Vision and Pattern Recognition · Computer Science 2019-11-26 Vincenzo Liguori

Embedding vectors are widely used for representing unstructured data and searching through it for semantically similar items. However, the large size of these vectors, due to their high-dimensionality, creates problems for modern vector…

Machine Learning · Computer Science 2025-09-24 Mariano Tepper , Ted Willke

Recent works on compression of large language models (LLM) using quantization considered reparameterizing the architecture such that weights are distributed on the sphere. This demonstratively improves the ability to quantize by increasing…

Machine Learning · Computer Science 2024-12-05 Tycho F. A. van der Ouderaa , Maximilian L. Croci , Agrin Hilmkil , James Hensman

Pyramid Vector Quantizer (PVQ) is a promising technique especially for multimedia data compression, already used in Opus audio codec and considered for AV1 video codec. It quantizes vectors from Euclidean unit sphere by first projecting…

Optimization and Control · Mathematics 2017-05-16 Jarek Duda

Vector quantization(VQ) is a hardware-friendly DNN compression method that can reduce the storage cost and weight-loading datawidth of hardware accelerators. However, conventional VQ techniques lead to significant accuracy loss because the…

Computer Vision and Pattern Recognition · Computer Science 2024-12-17 Shuaiting Li , Chengxuan Wang , Juncan Deng , Zeyu Wang , Zewen Ye , Zongsheng Wang , Haibin Shen , Kejie Huang

Quantization has been proven to be an effective method for reducing the computing and/or storage cost of DNNs. However, the trade-off between the quantization bitwidth and final accuracy is complex and non-convex, which makes it difficult…

Computer Vision and Pattern Recognition · Computer Science 2020-06-11 Cheng Gong , Yao Chen , Ye Lu , Tao Li , Cong Hao , Deming Chen

Operating deep neural networks (DNNs) on devices with limited resources requires the reduction of their memory as well as computational footprint. Popular reduction methods are network quantization or pruning, which either reduce the word…

Schmidt decomposition of a vector can be understood as writing the singular value decomposition (SVD) in vector form. A vector can be written as a linear combination of tensor product of two dimensional vectors by recursively applying…

Quantum Physics · Physics 2025-08-04 Ammar Daskin

Vector quantization (VQ) techniques are widely used in similarity search for data compression, fast metric computation and etc. Originally designed for Euclidean distance, existing VQ techniques (e.g., PQ, AQ) explicitly or implicitly…

Information Retrieval · Computer Science 2019-11-21 Xinyan Dai , Xiao Yan , Kelvin K. W. Ng , Jie Liu , James Cheng

Vector quantization (VQ) based ANN indexes, such as Inverted File System (IVF) and Product Quantization (PQ), have been widely applied to embedding based document retrieval thanks to the competitive time and memory efficiency. Originally,…

Information Retrieval · Computer Science 2022-04-29 Shitao Xiao , Zheng Liu , Weihao Han , Jianjin Zhang , Defu Lian , Yeyun Gong , Qi Chen , Fan Yang , Hao Sun , Yingxia Shao , Denvy Deng , Qi Zhang , Xing Xie

In this work, we developed and tested 3 techniques for vector quantization (VQ) based model weight compression. To mitigate codebook collapse and enable end-to-end training, we adopted cosine similarity-based assignment. Building on ideas…

Machine Learning · Computer Science 2026-04-28 Terry Gou , Puneet Gupta

Product Quantization (PQ) has long been a mainstream for generating an exponentially large codebook at very low memory/time cost. Despite its success, PQ is still tricky for the decomposition of high-dimensional vector space, and the…

Computer Vision and Pattern Recognition · Computer Science 2020-12-08 Lianli Gao , Xiaosu Zhu , Jingkuan Song , Zhou Zhao , Heng Tao Shen

Deep learning using neural networks is an effective technique for generating models of complex data. However, training such models can be expensive when networks have large model capacity resulting from a large number of layers and nodes.…

Machine Learning · Computer Science 2023-01-19 Jarom D. Hogue , Robert M. Kirby , Akil Narayan

Deep neural networks have achieved state-of-the art performance on various computer vision tasks. However, their deployment on resource-constrained devices has been hindered due to their high computational and storage complexity. While…

Computer Vision and Pattern Recognition · Computer Science 2020-07-21 Hassan Dbouk , Hetul Sanghvi , Mahesh Mehendale , Naresh Shanbhag

Vector quantization is common in deep models, yet its hard assignments block gradients and hinder end-to-end training. We propose DiVeQ, which treats quantization as adding an error vector that mimics the quantization distortion, keeping…

Machine Learning · Computer Science 2026-05-27 Mohammad Hassan Vali , Tom Bäckström , Arno Solin

With the development of deep neural networks, the size of network models becomes larger and larger. Model compression has become an urgent need for deploying these network models to mobile or embedded devices. Model quantization is a…

Machine Learning · Computer Science 2019-07-02 Wen-Pu Cai , Wu-Jun Li

Vector Quantization (VQ) is an appealing model compression method to obtain a tiny model with less accuracy loss. While methods to obtain better codebooks and codes under fixed clustering dimensionality have been extensively studied,…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Zezhou Zhu , Yucong Zhou , Zhao Zhong

We present PQS, which uses three techniques together - Prune, Quantize, and Sort - to achieve low-bitwidth accumulation of dot products in neural network computations. In conventional quantized (e.g., 8-bit) dot products, partial results…

Machine Learning · Computer Science 2025-04-15 Vikas Natesh , H. T. Kung

In virtual reality (VR) applications, 360-degree images play a pivotal role in crafting immersive experiences and offering panoramic views, thus improving user Quality of Experience (QoE). However, the voluminous data generated by…

Image and Video Processing · Electrical Eng. & Systems 2024-06-10 Yang Ma , Wenchi Cheng , Jingqing Wang , Wei Zhang
‹ Prev 1 2 3 10 Next ›