English
Related papers

Related papers: Exploring Inconsistent Knowledge Distillation for …

200 papers

Knowledge Distillation (KD) is a model-agnostic technique to improve model quality while having a fixed capacity budget. It is a commonly used technique for model compression, where a larger capacity teacher model with better quality is…

Machine Learning · Computer Science 2021-03-02 Jiaxi Tang , Rakesh Shivanna , Zhe Zhao , Dong Lin , Anima Singh , Ed H. Chi , Sagar Jain

Knowledge distillation (KD) is a technique for transferring knowledge from complex teacher models to simpler student models, significantly enhancing model efficiency and accuracy. It has demonstrated substantial advancements in various…

Computation and Language · Computer Science 2025-04-21 Junjie Yang , Junhao Song , Xudong Han , Ziqian Bi , Tianyang Wang , Chia Xin Liang , Xinyuan Song , Yichao Zhang , Qian Niu , Benji Peng , Keyu Chen , Ming Liu

Knowledge Distillation (KD) utilizes training data as a transfer set to transfer knowledge from a complex network (Teacher) to a smaller network (Student). Several works have recently identified many scenarios where the training data may…

Computer Vision and Pattern Recognition · Computer Science 2021-10-28 Gaurav Kumar Nayak , Monish Keswani , Sharan Seshadri , Anirban Chakraborty

Knowledge distillation (KD) has gained much attention due to its effectiveness in compressing large-scale pre-trained models. In typical KD methods, the small student model is trained to match the soft targets generated by the big teacher…

Machine Learning · Computer Science 2021-09-13 Yitao Liu , Tianxiang Sun , Xipeng Qiu , Xuanjing Huang

Knowledge distillation (KD) is widely used for training a compact model with the supervision of another large model, which could effectively improve the performance. Previous methods mainly focus on two aspects: 1) training the student to…

Computer Vision and Pattern Recognition · Computer Science 2020-07-27 Tiancheng Wen , Shenqi Lai , Xueming Qian

Knowledge distillation (KD) is one of the prominent techniques for model compression. In this method, the knowledge of a large network (teacher) is distilled into a model (student) with usually significantly fewer parameters. KD tries to…

Machine Learning · Computer Science 2023-01-31 Aref Jafari , Mehdi Rezagholizadeh , Ali Ghodsi

Knowledge Distillation (KD) is a widely-used technology to inherit information from cumbersome teacher models to compact student models, consequently realizing model compression and acceleration. Compared with image classification, object…

Computer Vision and Pattern Recognition · Computer Science 2021-12-10 Gang Li , Xiang Li , Yujie Wang , Shanshan Zhang , Yichao Wu , Ding Liang

Knowledge Distillation (KD) aims to transfer knowledge from a large teacher model to a smaller student model. While contrastive learning has shown promise in self-supervised learning by creating discriminative representations, its…

Computer Vision and Pattern Recognition · Computer Science 2025-05-14 Nikolaos Giakoumoglou , Tania Stathaki

Knowledge distillation (KD) is an effective method for compressing models in object detection tasks. Due to limited computational capability, UAV-based object detection (UAV-OD) widely adopt the KD technique to obtain lightweight detectors.…

Computer Vision and Pattern Recognition · Computer Science 2024-08-22 Liang Yao , Fan Liu , Chuanyi Zhang , Zhiquan Ou , Ting Wu

Deep learning models have demonstrated remarkable success in object detection, yet their complexity and computational intensity pose a barrier to deploying them in real-world applications (e.g., self-driving perception). Knowledge…

Computer Vision and Pattern Recognition · Computer Science 2023-03-09 Qizhen Lan , Qing Tian

Knowledge Distillation is a technique which aims to utilize dark knowledge to compress and transfer information from a vast, well-trained neural network (teacher model) to a smaller, less capable neural network (student model) with improved…

Computer Vision and Pattern Recognition · Computer Science 2022-01-28 Fahad Rahman Amik , Ahnaf Ismat Tasin , Silvia Ahmed , M. M. Lutfe Elahi , Nabeel Mohammed

Knowledge distillation (KD) is commonly deemed as an effective model compression technique in which a compact model (student) is trained under the supervision of a larger pretrained model or an ensemble of models (teacher). Various…

Machine Learning · Computer Science 2020-07-08 Fahad Sarfraz , Elahe Arani , Bahram Zonooz

In the era of large scale pretrained models, Knowledge Distillation (KD) serves an important role in transferring the wisdom of computationally heavy teacher models to lightweight, efficient student models while preserving performance.…

Machine Learning · Computer Science 2023-11-07 Alex Wilf , Alex Tianyi Xu , Paul Pu Liang , Alexander Obolenskiy , Daniel Fried , Louis-Philippe Morency

Knowledge distillation (KD) is a simple and successful method to transfer knowledge from a teacher to a student model solely based on functional activity. However, current KD has a few shortcomings: it has recently been shown that this…

Computer Vision and Pattern Recognition · Computer Science 2023-05-26 Arne F. Nix , Max F. Burg , Fabian H. Sinz

Despite substantial progress in 3D object detection, advanced 3D detectors often suffer from heavy computation overheads. To this end, we explore the potential of knowledge distillation (KD) for developing efficient 3D object detectors,…

Computer Vision and Pattern Recognition · Computer Science 2022-10-17 Jihan Yang , Shaoshuai Shi , Runyu Ding , Zhe Wang , Xiaojuan Qi

Does Knowledge Distillation (KD) really work? Conventional wisdom viewed it as a knowledge transfer procedure where a perfect mimicry of the student to its teacher is desired. However, paradoxical studies indicate that closely replicating…

Machine Learning · Computer Science 2024-05-03 Chenqi Guo , Shiwei Zhong , Xiaofeng Liu , Qianli Feng , Yinglong Ma

Knowledge distillation (KD) has been widely used to transfer knowledge from large, accurate models (teachers) to smaller, efficient ones (students). Recent methods have explored enforcing consistency by incorporating causal interpretations…

Computer Vision and Pattern Recognition · Computer Science 2025-07-17 Nikolaos Giakoumoglou , Tania Stathaki

Knowledge Distillation (KD) is a common method for transferring the ``knowledge'' learned by one machine learning model (the \textit{teacher}) into another model (the \textit{student}), where typically, the teacher has a greater capacity…

Machine Learning · Computer Science 2020-04-21 Jie Fu , Xue Geng , Zhijian Duan , Bohan Zhuang , Xingdi Yuan , Adam Trischler , Jie Lin , Chris Pal , Hao Dong

Knowledge distillation (KD) improves the performance of a low-complexity student model with the help of a more powerful teacher. The teacher in KD is a black-box model, imparting knowledge to the student only through its predictions. This…

Machine Learning · Computer Science 2023-10-05 Sayantan Chowdhury , Ben Liang , Ali Tizghadam , Ilijc Albanese

Knowledge distillation (KD) has been proven to be useful for training compact object detection models. However, we observe that KD is often effective when the teacher model and student counterpart share similar proposal information. This…

Computer Vision and Pattern Recognition · Computer Science 2022-10-10 Sheng Xu , Yanjing Li , Bohan Zeng , Teli ma , Baochang Zhang , Xianbin Cao , Peng Gao , Jinhu Lv
‹ Prev 1 2 3 10 Next ›