Related papers: Preference-Consistent Knowledge Distillation for R…

200 papers

Knowledge distillation is a mainstream model compression algorithm that transfers knowledge from the larger model (teacher) to the smaller model (student) to improve the performance of the student. Despite many efforts, existing methods…

Computer Vision and Pattern Recognition · Computer Science 2024-10-21 Muhe Ding, Jianlong Wu, Xue Dong, Xiaojie Li, Pengda Qin, Tian Gan, Liqiang Nie

Conventionally, during the knowledge distillation process (e.g. feature distillation), an additional projector is often required to perform feature transformation due to the dimension mismatch between the teacher and the student networks.…

Computer Vision and Pattern Recognition · Computer Science 2023-10-27 Yudong Chen, Sen Wang, Jiajun Liu, Xuwei Xu, Frank de Hoog, Brano Kusy, Zi Huang

In knowledge distillation, previous feature distillation methods mainly focus on the design of loss functions and the selection of the distilled layers, while the effect of the feature projector between the student and the teacher remains…

Computer Vision and Pattern Recognition · Computer Science 2023-03-02 Yudong Chen, Sen Wang, Jiajun Liu, Xuwei Xu, Frank de Hoog, Zi Huang

Knowledge distillation is an effective method for training small and efficient deep learning models. However, the efficacy of a single method can degenerate when transferring to other tasks, modalities, or even other architectures. To…

Computer Vision and Pattern Recognition · Computer Science 2024-03-12 Roy Miles, Ismail Elezi, Jiankang Deng

Knowledge Distillation (KD) aims to transfer knowledge from a large teacher model to a smaller student model. While contrastive learning has shown promise in self-supervised learning by creating discriminative representations, its…

Computer Vision and Pattern Recognition · Computer Science 2025-05-14 Nikolaos Giakoumoglou, Tania Stathaki

In this paper, we analyze the feature-based knowledge distillation for recommendation from the frequency perspective. By defining knowledge as different frequency components of the features, we theoretically demonstrate that regular…

Information Retrieval · Computer Science 2025-01-14 Zhangchi Zhu, Wei Zhang

Knowledge distillation aims to compress a powerful yet cumbersome teacher model into a lightweight student model without much sacrifice of performance. For this purpose, various approaches have been proposed over the past few years,…

Computer Vision and Pattern Recognition · Computer Science 2022-03-29 Defang Chen, Jian-Ping Mei, Hailin Zhang, Can Wang, Yan Feng, Chun Chen

Knowledge Distillation (KD) refers to transferring knowledge from a large model to a smaller one, which is widely used to enhance model performance in machine learning. It tries to align embedding spaces generated from the teacher and the…

Computer Vision and Pattern Recognition · Computer Science 2020-11-03 Weidong Shi, Guanghui Ren, Yunpeng Chen, Shuicheng Yan

Knowledge distillation is a common paradigm for transferring capabilities from larger models to smaller ones. While traditional distillation methods leverage a probabilistic divergence over the output of the teacher and student models,…

Machine Learning · Computer Science 2025-10-01 Prajjwal Bhattarai, Mohammad Amjad, Dmytro Zhylko, Tuka Alhanai

The representation gap between teacher and student is an emerging topic in knowledge distillation (KD). To reduce the gap and improve the performance, current methods often resort to complicated training schemes, loss functions, and feature…

Computer Vision and Pattern Recognition · Computer Science 2023-12-05 Tao Huang, Yuan Zhang, Mingkai Zheng, Shan You, Fei Wang, Chen Qian, Chang Xu

As a promising solution for model compression, knowledge distillation (KD) has been applied in recommender systems (RS) to reduce inference latency. Traditional solutions first train a full teacher model from the training data, and then…

Information Retrieval · Computer Science 2022-11-29 Gang Chen, Jiawei Chen, Fuli Feng, Sheng Zhou, Xiangnan He

Knowledge distillation is a method of transferring the knowledge from a pretrained complex teacher model to a student model, so a smaller network can replace a large teacher network at the deployment stage. To reduce the necessity of…

Computer Vision and Pattern Recognition · Computer Science 2021-03-16 Mingi Ji, Seungjae Shin, Seunghyun Hwang, Gibeom Park, Il-Chul Moon

Knowledge Distillation (KD) is a widely-used technology to inherit information from cumbersome teacher models to compact student models, consequently realizing model compression and acceleration. Compared with image classification, object…

Computer Vision and Pattern Recognition · Computer Science 2021-12-10 Gang Li, Xiang Li, Yujie Wang, Shanshan Zhang, Yichao Wu, Ding Liang

Despite its breakthrough in classification problems, Knowledge distillation (KD) to recommendation models and ranking problems has not been studied well in the previous literature. This dissertation is devoted to developing knowledge…

Information Retrieval · Computer Science 2024-07-22 SeongKu Kang

Knowledge Distillation (KD) for object detection aims to train a compact detector by transferring knowledge from a teacher model. Since the teacher model perceives data in a way different from humans, existing KD methods only distill…

Computer Vision and Pattern Recognition · Computer Science 2024-02-22 Jiawei Liang, Siyuan Liang, Aishan Liu, Ke Ma, Jingzhi Li, Xiaochun Cao

Knowledge Distillation (KD) is a powerful approach for compressing a large model into a smaller, more efficient model, particularly beneficial for latency-sensitive applications like recommender systems. However, current KD research…

Information Retrieval · Computer Science 2024-08-28 Nikhil Khani, Shuo Yang, Aniruddh Nath, Yang Liu, Pendo Abbo, Li Wei, Shawn Andrews, Maciej Kula, Jarrod Kahn, Zhe Zhao, Lichan Hong, Ed Chi

In this paper we revisit the efficacy of knowledge distillation as a function matching and metric learning problem. In doing so we verify three important design decisions, namely the normalisation, soft maximum function, and projection…

Computer Vision and Pattern Recognition · Computer Science 2024-02-02 Roy Miles, Krystian Mikolajczyk

Knowledge distillation (KD) is an effective model compression technique that transfers knowledge from a high-performance teacher to a lightweight student, reducing computational and storage costs while maintaining competitive accuracy.…

Computer Vision and Pattern Recognition · Computer Science 2025-11-17 Fengming Yu, Haiwei Pan, Kejia Zhang, Jian Guan, Haiying Jiang

The knowledge distillation field carefully designs various types of knowledge to shrink the performance gap between a compact student and a large-scale teacher. These existing distillation approaches simply focus on the improvement of…

Computer Vision and Pattern Recognition · Computer Science 2021-09-28 Xuanyang Zhang, Xiangyu Zhang, Jian Sun

Knowledge distillation (KD) has been applied to various tasks successfully, and mainstream methods typically boost the student model via spatial imitation losses. However, the consecutive downsamplings induced in the spatial domain of…

Computer Vision and Pattern Recognition · Computer Science 2024-05-24 Yuan Zhang, Tao Huang, Jiaming Liu, Tao Jiang, Kuan Cheng, Shanghang Zhang