Related papers: Precise Knowledge Transfer via Flow Matching

200 papers

Knowledge distillation (KD), known for its ability to transfer knowledge from a cumbersome network (teacher) to a lightweight one (student) without altering the architecture, has been garnering increasing attention. Two primary categories…

Computer Vision and Pattern Recognition · Computer Science 2024-09-30 Yaomin Huang, Zaomin Yan, Chaomin Shen, Faming Fang, Guixu Zhang

Knowledge distillation (KD) is an essential technique to compress large language models (LLMs) into smaller ones. However, despite the distinct roles of the student model and the teacher model in KD, most existing frameworks still use a…

Computation and Language · Computer Science 2026-03-25 Songming Zhang, Xue Zhang, Tong Zhang, Bojie Hu, Yufeng Chen, Jinan Xu

Knowledge distillation (KD) has become an important technique for model compression and knowledge transfer. In this work, we first perform a comprehensive analysis of the knowledge transferred by different KD methods. We demonstrate that…

Computer Vision and Pattern Recognition · Computer Science 2021-06-07 Fei Ding, Yin Yang, Hongxin Hu, Venkat Krovi, Feng Luo

Knowledge Distillation (KD) seeks to transfer the knowledge of a teacher to a student neural net. This process is often done by matching the networks' predictions (i.e., their output), but recently several works have proposed to…

Machine Learning · Statistics 2025-09-09 Eduardo Fernandes Montesuma

Knowledge Distillation (KD) based methods adopt the one-way Knowledge Transfer (KT) scheme in which training a lower-capacity student network is guided by a pre-trained high-capacity teacher network. Recently, Deep Mutual Learning (DML)…

Computer Vision and Pattern Recognition · Computer Science 2020-08-19 Anbang Yao, Dawei Sun

Knowledge Distillation (KD) methods are capable of transferring the knowledge encoded in a large and complex teacher into a smaller and faster student. Early methods were usually limited to transferring the knowledge only between the last…

Computer Vision and Pattern Recognition · Computer Science 2020-05-05 Nikolaos Passalis, Maria Tzelepi, Anastasios Tefas

Although deep neural networks have demonstrated extraordinary power in various applications, their superior performance comes at the expense of high storage and computational costs. Consequently, the acceleration and compression of neural…

Computer Vision and Pattern Recognition · Computer Science 2017-12-20 Zehao Huang, Naiyan Wang

Existing techniques often attempt to make knowledge transfer from a powerful machine translation (MT) to speech translation (ST) model with some elaborate techniques, which often requires transcription as extra input during training.…

Computation and Language · Computer Science 2023-04-21 Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Zhen Li

Flow matching (FM) is a general framework for defining probability paths via Ordinary Differential Equations (ODEs) to transform between noise and data samples. Recent approaches attempt to straighten these flow trajectories to generate…

Computer Vision and Pattern Recognition · Computer Science 2024-07-03 Ling Yang, Zixiang Zhang, Zhilong Zhang, Xingchao Liu, Minkai Xu, Wentao Zhang, Chenlin Meng, Stefano Ermon, Bin Cui

Knowledge distillation is a popular paradigm for learning portable neural networks by transferring the knowledge from a large model into a smaller one. Most existing approaches enhance the student model by utilizing the similarity…

Computer Vision and Pattern Recognition · Computer Science 2021-03-19 Haoran Zhao, Kun Gong, Xin Sun, Junyu Dong, Hui Yu

Knowledge distillation (KD) aims to transfer the knowledge of a more capable yet cumbersome teacher model to a lightweight student model. In recent years, relation-based KD methods have fallen behind, as their instance-matching counterparts…

Computer Vision and Pattern Recognition · Computer Science 2025-08-01 Weijia Zhang, Fei Xie, Weidong Cai, Chao Ma

Knowledge distillation (KD) transfers knowledge from large teacher models to compact student models, enabling efficient deployment on resource constrained devices. While diverse KD methods, including response based, feature based, and…

Machine Learning · Computer Science 2026-01-23 Yinxi Tian, Changwu Huang, Ke Tang, Xin Yao

Knowledge distillation (KD) is an effective model compression technique that transfers knowledge from a high-performance teacher to a lightweight student, reducing computational and storage costs while maintaining competitive accuracy.…

Computer Vision and Pattern Recognition · Computer Science 2025-11-17 Fengming Yu, Haiwei Pan, Kejia Zhang, Jian Guan, Haiying Jiang

We propose ClassroomKD, a novel multi-mentor knowledge distillation framework inspired by classroom environments to enhance knowledge transfer between the student and multiple mentors with different knowledge levels. Unlike traditional…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Shalini Sarode, Muhammad Saif Ullah Khan, Tahira Shehzadi, Didier Stricker, Muhammad Zeshan Afzal

Knowledge distillation is an effective and stable method for model compression via knowledge transfer. Conventional knowledge distillation (KD) is to transfer knowledge from a large and well pre-trained teacher network to a small student…

Computer Vision and Pattern Recognition · Computer Science 2021-11-24 Zhiqiang Liu, Yanxia Liu, Chengkai Huang

Knowledge distillation (KD) has been extensively employed to transfer the knowledge from a large teacher model to the smaller students, where the parameters of the teacher are fixed (or partially) during training. Recent studies show that…

Machine Learning · Computer Science 2022-06-01 Jun Rao, Xv Meng, Liang Ding, Shuhan Qi, Dacheng Tao

Knowledge distillation (KD), as an efficient and effective model compression technique, has been receiving considerable attention in deep learning. The key to its success is to transfer knowledge from a large teacher network to a small…

Machine Learning · Computer Science 2021-01-28 Liyuan Sun, Jianping Gou, Baosheng Yu, Lan Du, Dacheng Tao

Knowledge Distillation (KD) aims to transfer knowledge in a teacher-student framework, by providing the predictions of the teacher network to the student network in the training stage to help the student network generalize better. It can…

Computer Vision and Pattern Recognition · Computer Science 2019-09-25 SeongUk Park, Nojun Kwak

Knowledge distillation (KD) has demonstrated its effectiveness to boost the performance of graph neural networks (GNNs), where its goal is to distill knowledge from a deeper teacher GNN into a shallower student GNN. However, it is actually…

Machine Learning · Computer Science 2023-03-28 Kaituo Feng, Changsheng Li, Ye Yuan, Guoren Wang

Most teacher-student frameworks based on knowledge distillation (KD) depend on a strong congruent constraint on instance level. However, they usually ignore the correlation between multiple instances, which is also valuable for knowledge…

Computer Vision and Pattern Recognition · Computer Science 2019-04-04 Baoyun Peng, Xiao Jin, Jiaheng Liu, Shunfeng Zhou, Yichao Wu, Yu Liu, Dongsheng Li, Zhaoning Zhang