Related papers: Generative Low-bitwidth Data Free Quantization

GenQ: Quantization in Low Data Regimes with Generative Synthetic Data

In the realm of deep neural network deployment, low-bit quantization presents a promising avenue for enhancing computational efficiency. However, it often hinges on the availability of training data to mitigate quantization errors, a…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-18 · Yuhang Li, Youngeun Kim, Donghyun Lee, Souvik Kundu, Priyadarshini Panda

Long-Range Zero-Shot Generative Deep Network Quantization

Quantization approximates a deep network model with floating-point numbers by the one with low bit width numbers, in order to accelerate inference and reduce computation. Quantizing a model without access to the original data, zero-shot…

Computer Vision and Pattern Recognition · Computer Science · 2022-11-18 · Yan Luo, Yangcheng Gao, Zhao Zhang, Haijun Zhang, Mingliang Xu, Meng Wang

Adaptive Data-Free Quantization

Data-free quantization (DFQ) recovers the performance of quantized network (Q) without the original data, but generates the fake sample via a generator (G) by learning from full-precision network (P), which, however, is totally independent…

Computer Vision and Pattern Recognition · Computer Science · 2023-03-21 · Biao Qian, Yang Wang, Richang Hong, Meng Wang

Towards Feature Distribution Alignment and Diversity Enhancement for Data-Free Quantization

To obtain lower inference latency and less memory footprint of deep neural networks, model quantization has been widely employed in deep model deployment, by converting the floating points to low-precision integers. However, previous…

Computer Vision and Pattern Recognition · Computer Science · 2022-12-20 · Yangcheng Gao, Zhao Zhang, Richang Hong, Haijun Zhang, Jicong Fan, Shuicheng Yan

Diverse Sample Generation: Pushing the Limit of Generative Data-free Quantization

Generative data-free quantization emerges as a practical compression approach that quantizes deep neural networks to low bit-width without accessing the real data. This approach generates data utilizing batch normalization (BN) statistics…

Computer Vision and Pattern Recognition · Computer Science · 2022-10-21 · Haotong Qin, Yifu Ding, Xiangguo Zhang, Jiakai Wang, Xianglong Liu, Jiwen Lu

Data-Free Quantization Through Weight Equalization and Bias Correction

We introduce a data-free quantization method for deep neural networks that does not require fine-tuning or hyperparameter selection. It achieves near-original model performance on common computer vision architectures and tasks. 8-bit…

Machine Learning · Computer Science · 2019-11-26 · Markus Nagel, Mart van Baalen, Tijmen Blankevoort, Max Welling

Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks

Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently accelerate the inference and meanwhile reduce memory consumption of the deep neural networks, which is crucial for model deployment on…

Computer Vision and Pattern Recognition · Computer Science · 2019-08-15 · Ruihao Gong, Xianglong Liu, Shenghu Jiang, Tianxiang Li, Peng Hu, Jiazhen Lin, Fengwei Yu, Junjie Yan

Quantization of Generative Adversarial Networks for Efficient Inference: a Methodological Study

Generative adversarial networks (GANs) have an enormous potential impact on digital content creation, e.g., photo-realistic digital avatars, semantic content editing, and quality enhancement of speech and images. However, the performance of…

Artificial Intelligence · Computer Science · 2021-09-01 · Pavel Andreev, Alexander Fritzler, Dmitry Vetrov

MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity

Data-free quantization (DFQ) is a technique that creates a lightweight network from its full-precision counterpart without the original training data, often through a synthetic dataset. Although several DFQ methods have been proposed for…

Machine Learning · Computer Science · 2025-04-15 · Kanghyun Choi, Hye Yoon Lee, Dain Kwon, SunJong Park, Kyuyeun Kim, Noseong Park, Jonghyun Choi, Jinho Lee

Learnable Companding Quantization for Accurate Low-bit Neural Networks

Quantizing deep neural networks is an effective method for reducing memory consumption and improving inference speed, and is thus useful for implementation in resource-constrained devices. However, it is still hard for extremely low-bit…

Computer Vision and Pattern Recognition · Computer Science · 2021-11-03 · Kohei Yamamoto

Post-training Model Quantization Using GANs for Synthetic Data Generation

Quantization is a widely adopted technique for deep neural networks to reduce the memory and computational resources required. However, when quantized, most models would need a suitable calibration process to keep their performance intact,…

Computer Vision and Pattern Recognition · Computer Science · 2023-05-11 · Athanasios Masouris, Mansi Sharma, Adrian Boguszewski, Alexander Kozlov, Zhuo Wu, Raymond Lo

Data-Free Network Quantization With Adversarial Knowledge Distillation

Network quantization is an essential procedure in deep learning for development of efficient fixed-point inference models on mobile or edge platforms. However, as datasets grow larger and privacy regulations become stricter, data sharing…

Computer Vision and Pattern Recognition · Computer Science · 2020-05-11 · Yoojin Choi, Jihwan Choi, Mostafa El-Khamy, Jungwon Lee

DNQ: Dynamic Network Quantization

Network quantization is an effective method for the deployment of neural networks on memory and energy constrained mobile devices. In this paper, we propose a Dynamic Network Quantization (DNQ) framework which is composed of two modules: a…

Machine Learning · Computer Science · 2018-12-07 · Yuhui Xu, Shuai Zhang, Yingyong Qi, Jiaxian Guo, Weiyao Lin, Hongkai Xiong

Post-training Quantization for Neural Networks with Provable Guarantees

While neural networks have been remarkably successful in a wide array of applications, implementing them in resource-constrained hardware remains an area of intense research. By replacing the weights of a neural network with quantized…

Machine Learning · Computer Science · 2023-01-18 · Jinjie Zhang, Yixuan Zhou, Rayan Saab

Enhancing Generalization in Data-free Quantization via Mixup-class Prompting

Post-training quantization (PTQ) improves efficiency but struggles with limited calibration data, especially under privacy constraints. Data-free quantization (DFQ) mitigates this by generating synthetic images using generative models such…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-30 · Jiwoong Park, Chaeun Lee, Yongseok Choi, Sein Park, Deokki Hong, Jungwook Choi

Defensive Quantization: When Efficiency Meets Robustness

Neural network quantization is becoming an industry standard to efficiently deploy deep learning models on hardware platforms, such as CPU, GPU, TPU, and FPGAs. However, we observe that the conventional quantization approaches are…

Machine Learning · Computer Science · 2019-04-19 · Ji Lin, Chuang Gan, Song Han

Hybrid and Non-Uniform quantization methods using retro synthesis data for efficient inference

Existing quantization aware training methods attempt to compensate for the quantization loss by leveraging on training data, like most of the post-training quantization methods, and are also time consuming. Both these methods are not…

Computer Vision and Pattern Recognition · Computer Science · 2020-12-29 · Tej pratap GVSL, Raja Kumar

DFQ-ViT: Data-Free Quantization for Vision Transformers without Fine-tuning

Data-Free Quantization (DFQ) enables the quantization of Vision Transformers (ViTs) without requiring access to data, allowing for the deployment of ViTs on devices with limited resources. In DFQ, the quantization model must be calibrated…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-22 · Yujia Tong, Jingling Yuan, Tian Zhang, Jianquan Liu, Chuang Hu

HAQ: Hardware-Aware Automated Quantization with Mixed Precision

Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve the computation efficiency,…

Computer Vision and Pattern Recognition · Computer Science · 2019-04-09 · Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han

SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation

Quantization of deep neural networks (DNN) has been proven effective for compressing and accelerating DNN models. Data-free quantization (DFQ) is a promising approach without the original datasets under privacy-sensitive and confidential…

Machine Learning · Computer Science · 2022-02-16 · Cong Guo, Yuxian Qiu, Jingwen Leng, Xiaotian Gao, Chen Zhang, Yunxin Liu, Fan Yang, Yuhao Zhu, Minyi Guo