Related papers: Quantized Feature Distillation for Network Quantiz…

QKD: Quantization-aware Knowledge Distillation

Quantization and Knowledge distillation (KD) methods are widely used to reduce memory and power consumption of deep neural networks (DNNs), especially for resource-constrained edge devices. Although their combination is quite promising to…

Computer Vision and Pattern Recognition · Computer Science 2019-12-02 Jangho Kim, Yash Bhalgat, Jinwon Lee, Chirag Patel, Nojun Kwak

Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks

The deep layers of modern neural networks extract a rather rich set of features as an input propagates through the network. This paper sets out to harvest these rich intermediate representations for quantization with minimal accuracy loss…

Machine Learning · Computer Science 2020-03-04 Ahmed T. Elthakeb, Prannoy Pilligundla, Alex Cloninger, Hadi Esmaeilzadeh

QUEST: Quantized embedding space for transferring knowledge

Knowledge distillation refers to the process of training a compact student network to achieve better accuracy by learning from a high capacity teacher network. Most of the existing knowledge distillation methods direct the student to follow…

Computer Vision and Pattern Recognition · Computer Science 2020-07-21 Himalaya Jain, Spyros Gidaris, Nikos Komodakis, Patrick Pérez, Matthieu Cord

From Fewer Samples to Fewer Bits: Reframing Dataset Distillation as Joint Optimization of Precision and Compactness

Dataset Distillation (DD) compresses large datasets into compact synthetic ones that maintain training performance. However, current methods mainly target sample reduction, with limited consideration of data precision and its impact on…

Computer Vision and Pattern Recognition · Computer Science 2026-03-04 My H. Dinh, Aditya Sant, Akshay Malhotra, Keya Patani, Shahab Hamidi-Rad

Self-Supervised Quantization-Aware Knowledge Distillation

Quantization-aware training (QAT) and Knowledge Distillation (KD) are combined to achieve competitive performance in creating low-bit deep learning models. However, existing works applying KD to QAT require tedious hyper-parameter tuning to…

Machine Learning · Computer Science 2024-03-19 Kaiqi Zhao, Ming Zhao

Model compression via distillation and quantization

Deep neural networks (DNNs) continue to make significant advances, solving tasks from image classification to translation or reinforcement learning. One aspect of the field receiving considerable attention is efficiently executing deep…

Neural and Evolutionary Computing · Computer Science 2018-02-16 Antonio Polino, Razvan Pascanu, Dan Alistarh

Distilling Efficient Vision Transformers from CNNs for Semantic Segmentation

In this paper, we tackle a new problem: how to transfer knowledge from the pre-trained cumbersome yet well-performed CNN-based model to learn a compact Vision Transformer (ViT)-based model while maintaining its learning capacity? Due to the…

Computer Vision and Pattern Recognition · Computer Science 2023-10-12 Xu Zheng, Yunhao Luo, Pengyuan Zhou, Lin Wang

FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer

Network quantization significantly reduces model inference complexity and has been widely used in real-world deployments. However, most existing quantization methods have been developed mainly on Convolutional Neural Networks (CNNs), and…

Computer Vision and Pattern Recognition · Computer Science 2023-02-20 Yang Lin, Tianyu Zhang, Peiqin Sun, Zheng Li, Shuchang Zhou

Collaborative Multi-Teacher Knowledge Distillation for Learning Low Bit-width Deep Neural Networks

Knowledge distillation which learns a lightweight student model by distilling knowledge from a cumbersome teacher model is an attractive approach for learning compact deep neural networks (DNNs). Recent works further improve student network…

Computer Vision and Pattern Recognition · Computer Science 2022-10-31 Cuong Pham, Tuan Hoang, Thanh-Toan Do

Knowledge distillation for optimization of quantized deep neural networks

Knowledge distillation (KD) is a very popular method for model size reduction. Recently, the technique is exploited for quantized deep neural networks (QDNNs) training as a way to restore the performance sacrificed by word-length reduction.…

Machine Learning · Computer Science 2019-10-24 Sungho Shin, Yoonho Boo, Wonyong Sung

Poster: Self-Supervised Quantization-Aware Knowledge Distillation

Quantization-aware training (QAT) starts with a pre-trained full-precision model and performs quantization during retraining. However, existing QAT works require supervision from the labels and they suffer from accuracy loss due to reduced…

Computer Vision and Pattern Recognition · Computer Science 2023-09-26 Kaiqi Zhao, Ming Zhao

Decoder-Free Distillation for Quantized Image Restoration

Quantization-Aware Training (QAT), combined with Knowledge Distillation (KD), holds immense promise for compressing models for edge deployment. However, joint optimization for precision-sensitive image restoration (IR) to recover visual…

Computer Vision and Pattern Recognition · Computer Science 2026-03-11 S. M. A. Sharif, Abdur Rehman, Seongwan Kim, Jaeho Lee

Quantized Distillation: Optimizing Driver Activity Recognition Models for Resource-Constrained Environments

Deep learning-based models are at the forefront of most driver observation benchmarks due to their remarkable accuracies but are also associated with high computational costs. This is challenging, as resources are often limited in…

Computer Vision and Pattern Recognition · Computer Science 2023-11-13 Calvin Tanama, Kunyu Peng, Zdravko Marinov, Rainer Stiefelhagen, Alina Roitberg

QD-BEV : Quantization-aware View-guided Distillation for Multi-view 3D Object Detection

Multi-view 3D detection based on BEV (bird-eye-view) has recently achieved significant improvements. However, the huge memory consumption of state-of-the-art models makes it hard to deploy them on vehicles, and the non-trivial latency will…

Computer Vision and Pattern Recognition · Computer Science 2023-08-22 Yifan Zhang, Zhen Dong, Huanrui Yang, Ming Lu, Cheng-Ching Tseng, Yuan Du, Kurt Keutzer, Li Du, Shanghang Zhang

AQD: Towards Accurate Fully-Quantized Object Detection

Network quantization allows inference to be conducted using low-precision arithmetic for improved inference efficiency of deep neural networks on edge devices. However, designing aggressively low-bit (e.g., 2-bit) quantization schemes on…

Computer Vision and Pattern Recognition · Computer Science 2024-02-23 Peng Chen, Jing Liu, Bohan Zhuang, Mingkui Tan, Chunhua Shen

Normalized Feature Distillation for Semantic Segmentation

As a promising approach in model compression, knowledge distillation improves the performance of a compact model by transferring the knowledge from a cumbersome one. The kind of knowledge used to guide the training of the student is…

Computer Vision and Pattern Recognition · Computer Science 2022-07-13 Tao Liu, Xi Yang, Chenshu Chen

Data-Augmented Quantization-Aware Knowledge Distillation

Quantization-aware training (QAT) and Knowledge Distillation (KD) are combined to achieve competitive performance in creating low-bit deep learning models. Existing KD and QAT works focus on improving the accuracy of quantized models from…

Machine Learning · Computer Science 2025-09-05 Justin Kur, Kaiqi Zhao

A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification

Recent advancements in machine learning achieved by Deep Neural Networks (DNNs) have been significant. While demonstrating high accuracy, DNNs are associated with a huge number of parameters and computations, which leads to high memory…

Machine Learning · Computer Science 2023-12-20 Babak Rokh, Ali Azarpeyvand, Alireza Khanteymoori

Quantization-Aware Distillation for NVFP4 Inference Accuracy Recovery

This technical report presents quantization-aware distillation (QAD) and our best practices for recovering accuracy of NVFP4-quantized large language models (LLMs) and vision-language models (VLMs). QAD distills a full-precision teacher…

Machine Learning · Computer Science 2026-03-04 Meng Xin, Sweta Priyadarshi, Jingyu Xin, Bilal Kartal, Aditya Vavre, Asma Kuriparambil Thekkumpate, Zijia Chen, Ameya Sunil Mahabaleshwarkar, Ido Shahaf, Akhiad Bercovich, Kinjal Patel, Suguna Varshini Velury, Chenjie Luo, Zhiyu Cheng, Jenny Chen, Chen-Han Yu, Wei Ping, Oleg Rybakov, Nima Tajbakhsh, Oluwatobi Olabiyi, Dusan Stosic, Di Wu, Song Han, Eric Chung, Sharath Turuvekere Sreenivas, Bryan Catanzaro, Yoshi Suhara, Tijmen Blankevoort, Huizi Mao

Quantization Networks

Although deep neural networks are highly effective, their high computational and memory costs severely challenge their applications on portable devices. As a consequence, low-bit quantization, which converts a full-precision neural network…

Computer Vision and Pattern Recognition · Computer Science 2019-12-02 Jiwei Yang, Xu Shen, Jun Xing, Xinmei Tian, Houqiang Li, Bing Deng, Jianqiang Huang, Xiansheng Hua