Related papers: Model Compression Using Optimal Transport

Activation Map Adaptation for Effective Knowledge Distillation

Model compression has become a recent trend due to the requirement of deploying neural networks on embedded and mobile devices. Hence, both accuracy and efficiency are of critical importance. To explore a balance between them, a knowledge…

Computer Vision and Pattern Recognition · Computer Science 2022-04-15 Zhiyuan Wu, Hong Qi, Yu Jiang, Minghao Zhao, Chupeng Cui, Zongmin Yang, Xinhui Xue

Deep Model Compression: Distilling Knowledge from Noisy Teachers

The remarkable successes of deep learning models across various applications have resulted in the design of deeper networks that can solve complex problems. However, the increasing depth of such models also results in a higher storage and…

Machine Learning · Computer Science 2016-11-03 Bharat Bhusan Sau, Vineeth N. Balasubramanian

An Efficient Method of Training Small Models for Regression Problems with Knowledge Distillation

Compressing deep neural network (DNN) models has become a very important and necessary technique for real-world applications, such as deploying those models on mobile devices. Knowledge distillation is one of the most popular methods for model…

Machine Learning · Computer Science 2020-03-02 Makoto Takamoto, Yusuke Morishita, Hitoshi Imaoka

Learning Student Networks via Feature Embedding

Deep convolutional neural networks have been widely used in numerous applications, but their demanding storage and computational resource requirements prevent their applications on mobile devices. Knowledge distillation aims to optimize a…

Machine Learning · Computer Science 2018-12-18 Hanting Chen, Yunhe Wang, Chang Xu, Chao Xu, Dacheng Tao

Learning Metrics from Teachers: Compact Networks for Image Embedding

Metric learning networks are used to compute image embeddings, which are widely used in many applications such as image retrieval and face recognition. In this paper, we propose to use network distillation to efficiently compute image…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Lu Yu, Vacit Oguz Yazici, Xialei Liu, Joost van de Weijer, Yongmei Cheng, Arnau Ramisa

Knowledge Distillation: A Survey

In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks. The great success of deep learning is mainly due to its scalability to encode large-scale data and to maneuver…

Machine Learning · Computer Science 2021-05-21 Jianping Gou, Baosheng Yu, Stephen John Maybank, Dacheng Tao

Model compression via distillation and quantization

Deep neural networks (DNNs) continue to make significant advances, solving tasks from image classification to translation or reinforcement learning. One aspect of the field receiving considerable attention is efficiently executing deep…

Neural and Evolutionary Computing · Computer Science 2018-02-16 Antonio Polino, Razvan Pascanu, Dan Alistarh

What is Lost in Knowledge Distillation?

Deep neural networks (DNNs) have improved NLP tasks significantly, but training and maintaining such networks could be costly. Model compression techniques, such as knowledge distillation (KD), have been proposed to address the issue;…

Computation and Language · Computer Science 2023-11-08 Manas Mohanty, Tanya Roosta, Peyman Passban

Efficient Learned Image Compression Through Knowledge Distillation

Learned image compression sits at the intersection of machine learning and image processing. With advances in deep learning, neural network-based compression methods have emerged. In this process, an encoder maps the image to a…

Computer Vision and Pattern Recognition · Computer Science 2025-09-15 Fabien Allemand, Attilio Fiandrotti, Sumanta Chaudhuri, Alaa Eddine Mazouz

Learning from a Teacher using Unlabeled Data

Knowledge distillation is a widely used technique for model compression. We posit that the teacher model used in a distillation setup captures relationships between classes that extend beyond the original dataset. We empirically show that…

Machine Learning · Computer Science 2019-11-14 Gaurav Menghani, Sujith Ravi

Online Ensemble Model Compression using Knowledge Distillation

This paper presents a novel knowledge distillation based model compression framework consisting of a student ensemble. It enables distillation of simultaneously learnt ensemble knowledge onto each of the compressed student models. Each…

Computer Vision and Pattern Recognition · Computer Science 2020-11-17 Devesh Walawalkar, Zhiqiang Shen, Marios Savvides

Compact CNN Structure Learning by Knowledge Distillation

The concept of compressing deep Convolutional Neural Networks (CNNs) is essential to use limited computation, power, and memory resources on embedded devices. However, existing methods achieve this objective at the cost of a drop in…

Computer Vision and Pattern Recognition · Computer Science 2021-04-20 Waqar Ahmed, Andrea Zunino, Pietro Morerio, Vittorio Murino

Deep Face Recognition Model Compression via Knowledge Transfer and Distillation

Fully convolutional networks (FCNs) have become the de facto tool to achieve very high-level performance for many vision and non-vision tasks in general and face recognition in particular. Such high-level accuracies are normally obtained by…

Computer Vision and Pattern Recognition · Computer Science 2019-06-04 Jayashree Karlekar, Jiashi Feng, Zi Sian Wong, Sugiri Pranata

Model compression using knowledge distillation with integrated gradients

Model compression is critical for deploying deep learning models on resource-constrained devices. We introduce a novel method enhancing knowledge distillation with integrated gradients (IG) as a data augmentation strategy. Our approach…

Computer Vision and Pattern Recognition · Computer Science 2025-06-18 David E. Hernandez, Jose Chang, Torbjörn E. M. Nordling

Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer

Deep neural network architectures have attained remarkable improvements in scene understanding tasks. Utilizing an efficient model is one of the most important constraints for limited-resource devices. Recently, several compression methods…

Computer Vision and Pattern Recognition · Computer Science 2020-10-12 Mahdi Ghorbani, Fahimeh Fooladgar, Shohreh Kasaei

Knowledge Distillation with the Reused Teacher Classifier

Knowledge distillation aims to compress a powerful yet cumbersome teacher model into a lightweight student model without much sacrifice of performance. For this purpose, various approaches have been proposed over the past few years,…

Computer Vision and Pattern Recognition · Computer Science 2022-03-29 Defang Chen, Jian-Ping Mei, Hailin Zhang, Can Wang, Yan Feng, Chun Chen

SlimNets: An Exploration of Deep Model Compression and Acceleration

Deep neural networks have achieved increasingly accurate results on a wide variety of complex tasks. However, much of this improvement is due to the growing use and availability of computational resources (e.g., use of GPUs, more layers, more…

Machine Learning · Computer Science 2018-08-03 Ini Oguntola, Subby Olubeko, Christopher Sweeney

Extreme compression of sentence-transformer ranker models: faster inference, longer battery life, and less storage on edge devices

Modern search systems use several large ranker models with transformer architectures. These models require large computational resources and are not suitable for usage on devices with limited computational resources. Knowledge distillation…

Machine Learning · Computer Science 2022-07-27 Amit Chaulwar, Lukas Malik, Maciej Krajewski, Felix Reichel, Leif-Nissen Lundbæk, Michael Huth, Bartlomiej Matejczyk

Model Distillation with Knowledge Transfer from Face Classification to Alignment and Verification

Knowledge distillation is a potential solution for model compression. The idea is to make a small student network imitate the target of a large teacher network, so that the student network can be competitive with the teacher. Most previous…

Computer Vision and Pattern Recognition · Computer Science 2017-10-24 Chong Wang, Xipeng Lan, Yangang Zhang

Real-Time Correlation Tracking via Joint Model Compression and Transfer

Correlation filters (CF) have received considerable attention in visual tracking because of their computational efficiency. Leveraging deep features via off-the-shelf CNN models (e.g., VGG), CF trackers achieve state-of-the-art performance…

Computer Vision and Pattern Recognition · Computer Science 2020-06-24 Ning Wang, Wengang Zhou, Yibing Song, Chao Ma, Houqiang Li