Related papers: Differentiable Product Quantization for End-to-End…

Beyond Product Quantization: Deep Progressive Quantization for Image Retrieval

Product Quantization (PQ) has long been a mainstream for generating an exponentially large codebook at very low memory/time cost. Despite its success, PQ is still tricky for the decomposition of high-dimensional vector space, and the…

Computer Vision and Pattern Recognition · Computer Science 2020-12-08 Lianli Gao, Xiaosu Zhu, Jingkuan Song, Zhou Zhao, Heng Tao Shen

End-to-End Supervised Product Quantization for Image Search and Retrieval

Product Quantization, a dictionary based hashing method, is one of the leading unsupervised hashing techniques. While it ignores the labels, it harnesses the features to construct look up tables that can approximate the feature space. In…

Computer Vision and Pattern Recognition · Computer Science 2020-01-22 Benjamin Klein, Lior Wolf

Differentiable Model Compression via Pseudo Quantization Noise

We propose DiffQ a differentiable method for model compression for quantizing model parameters without gradient approximations (e.g., Straight Through Estimator). We suggest adding independent pseudo quantization noise to model parameters…

Machine Learning · Statistics 2022-10-18 Alexandre Défossez, Yossi Adi, Gabriel Synnaeve

Embedding Compression with Isotropic Iterative Quantization

Continuous representation of words is a standard component in deep learning-based NLP models. However, representing a large vocabulary requires significant memory, which can cause problems, particularly on resource-constrained platforms.…

Computation and Language · Computer Science 2020-01-24 Siyu Liao, Jie Chen, Yanzhi Wang, Qinru Qiu, Bo Yuan

Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution

Model quantization is challenging due to many tedious hyper-parameters such as precision (bitwidth), dynamic range (minimum and maximum discrete values) and stepsize (interval between discrete values). Unlike prior arts that carefully tune…

Machine Learning · Computer Science 2021-07-08 Zhang Zhaoyang, Shao Wenqi, Gu Jinwei, Wang Xiaogang, Luo Ping

Compression of Deep Neural Networks on the Fly

Thanks to their state-of-the-art performance, deep neural networks are increasingly used for object recognition. To achieve these results, they use millions of parameters to be trained. However, when targeting embedded applications the size…

Machine Learning · Computer Science 2016-03-21 Guillaume Soulié, Vincent Gripon, Maëlys Robert

Dataset Color Quantization: A Training-Oriented Framework for Dataset-Level Compression

Large-scale image datasets are fundamental to deep learning, but their high storage demands pose challenges for deployment in resource-constrained environments. While existing approaches reduce dataset size by discarding samples, they often…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Chenyue Yu, Lingao Xiao, Jinhong Deng, Ivor W. Tsang, Yang He

BPQP: A Differentiable Convex Optimization Framework for Efficient End-to-End Learning

Data-driven decision-making processes increasingly utilize end-to-end learnable deep neural networks to render final decisions. Sometimes, the output of the forward functions in certain layers is determined by the solutions to mathematical…

Machine Learning · Computer Science 2024-12-31 Jianming Pan, Zeqi Ye, Xiao Yang, Xu Yang, Weiqing Liu, Lewen Wang, Jiang Bian

Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations

We present a new approach to learn compressible representations in deep architectures with an end-to-end training strategy. Our method is based on a soft (continuous) relaxation of quantization and entropy, which we anneal to their discrete…

Machine Learning · Computer Science 2017-06-09 Eirikur Agustsson, Fabian Mentzer, Michael Tschannen, Lukas Cavigelli, Radu Timofte, Luca Benini, Luc Van Gool

Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks

Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently accelerate the inference and meanwhile reduce memory consumption of the deep neural networks, which is crucial for model deployment on…

Computer Vision and Pattern Recognition · Computer Science 2019-08-15 Ruihao Gong, Xianglong Liu, Shenghu Jiang, Tianxiang Li, Peng Hu, Jiazhen Lin, Fengwei Yu, Junjie Yan

EPTQ: Enhanced Post-Training Quantization via Hessian-guided Network-wise Optimization

Quantization is a key method for deploying deep neural networks on edge devices with limited memory and computation resources. Recent improvements in Post-Training Quantization (PTQ) methods were achieved by an additional local optimization…

Computer Vision and Pattern Recognition · Computer Science 2024-09-27 Ofir Gordon, Elad Cohen, Hai Victor Habi, Arnon Netzer

SDQ: Stochastic Differentiable Quantization with Mixed Precision

In order to deploy deep models in a computationally efficient manner, model quantization approaches have been frequently used. In addition, as new hardware that supports mixed bitwidth arithmetic operations, recent research on mixed…

Machine Learning · Computer Science 2022-07-12 Xijie Huang, Zhiqiang Shen, Shichao Li, Zechun Liu, Xianghong Hu, Jeffry Wicaksana, Eric Xing, Kwang-Ting Cheng

End-to-End Facial Deep Learning Feature Compression with Teacher-Student Enhancement

In this paper, we propose a novel end-to-end feature compression scheme by leveraging the representation and learning capability of deep neural networks, towards intelligent front-end equipped analysis with promising accuracy and…

Computer Vision and Pattern Recognition · Computer Science 2020-02-11 Shurun Wang, Wenhan Yang, Shiqi Wang

Implicit Feature Decoupling with Depthwise Quantization

Quantization has been applied to multiple domains in Deep Neural Networks (DNNs). We propose Depthwise Quantization (DQ) where $\textit{quantization}$ is applied to a decomposed sub-tensor along the $\textit{feature axis}$ of weak…

Computer Vision and Pattern Recognition · Computer Science 2022-03-30 Iordanis Fostiropoulos, Barry Boehm

Adaptive Dataset Quantization

Contemporary deep learning, characterized by the training of cumbersome neural networks on massive datasets, confronts substantial computational hurdles. To alleviate heavy data storage burdens on limited hardware resources, numerous…

Computer Vision and Pattern Recognition · Computer Science 2024-12-24 Muquan Li, Dongyang Zhang, Qiang Dong, Xiurui Xie, Ke Qin

To Compress, or Not to Compress: Characterizing Deep Learning Model Compression for Embedded Inference

The recent advances in deep neural networks (DNNs) make them attractive for embedded systems. However, it can take a long time for DNNs to make an inference on resource-constrained computing devices. Model compression techniques can address…

Machine Learning · Computer Science 2018-10-23 Qing Qin, Jie Ren, Jialong Yu, Ling Gao, Hai Wang, Jie Zheng, Yansong Feng, Jianbin Fang, Zheng Wang

Dual Precision Quantization for Efficient and Accurate Deep Neural Networks Inference

Deep neural networks have achieved state-of-the-art results in a wide range of applications, from natural language processing and computer vision to speech recognition. However, as tasks become increasingly complex, model sizes continue to…

Computer Vision and Pattern Recognition · Computer Science 2025-05-21 Tomer Gafni, Asaf Karnieli, Yair Hanani

Quality Scalable Quantization Methodology for Deep Learning on Edge

Deep Learning Architectures employ heavy computations and bulk of the computational energy is taken up by the convolution operations in the Convolutional Neural Networks. The objective of our proposed work is to reduce the energy…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-17 Salman Abdul Khaliq, Rehan Hafiz

Flexible Mixed Precision Quantization for Learned Image Compression

Despite its improvements in coding performance compared to traditional codecs, Learned Image Compression (LIC) suffers from large computational costs for storage and deployment. Model quantization offers an effective solution to reduce the…

Image and Video Processing · Electrical Eng. & Systems 2025-06-03 Md Adnan Faisal Hossain, Zhihao Duan, Fengqing Zhu

From Hours to Seconds: Towards 100x Faster Quantitative Phase Imaging via Differentiable Microscopy

With applications ranging from metabolomics to histopathology, quantitative phase microscopy (QPM) is a powerful label-free imaging modality. Despite significant advances in fast multiplexed imaging sensors and deep-learning-based inverse…

Image and Video Processing · Electrical Eng. & Systems 2023-10-10 Udith Haputhanthri, Kithmini Herath, Ramith Hettiarachchi, Hasindu Kariyawasam, Azeem Ahmad, Balpreet S. Ahluwalia, Chamira U. S. Edussooriya, Dushan N. Wadduwage