Related papers: Rethinking Centered Kernel Alignment in Knowledge …

200 papers

Most teacher-student frameworks based on knowledge distillation (KD) depend on a strong congruent constraint on instance level. However, they usually ignore the correlation between multiple instances, which is also valuable for knowledge…

Computer Vision and Pattern Recognition · Computer Science 2019-04-04 Baoyun Peng, Xiao Jin, Jiaheng Liu, Shunfeng Zhou, Yichao Wu, Yu Liu, Dongsheng Li, Zhaoning Zhang

Comparing learned neural representations in neural networks is a challenging but important problem, which has been approached in different ways. The Centered Kernel Alignment (CKA) similarity metric, particularly its linear variant, has…

Machine Learning · Computer Science 2022-11-17 MohammadReza Davari, Stefan Horoi, Amine Natik, Guillaume Lajoie, Guy Wolf, Eugene Belilovsky

Centered kernel alignment (CKA) is a popular metric for comparing representations, determining equivalence of networks, and neuroscience research. However, CKA does not account for the underlying manifold and relies on numerous heuristics…

Machine Learning · Computer Science 2025-10-28 Mohammad Tariqul Islam, Du Liu, Deblina Sarkar

Knowledge distillation is a common paradigm for transferring capabilities from larger models to smaller ones. While traditional distillation methods leverage a probabilistic divergence over the output of the teacher and student models,…

Machine Learning · Computer Science 2025-10-01 Prajjwal Bhattarai, Mohammad Amjad, Dmytro Zhylko, Tuka Alhanai

In practical applications of human pose estimation, low-resolution inputs frequently occur, and existing state-of-the-art models perform poorly with low-resolution images. This work focuses on boosting the performance of low-resolution…

Computer Vision and Pattern Recognition · Computer Science 2024-05-21 Zejun Gu, Zhong-Qiu Zhao, Henghui Ding, Hao Shen, Zhao Zhang, De-Shuang Huang

Centred Kernel Alignment (CKA) has recently emerged as a popular metric to compare activations from biological and artificial neural networks (ANNs) in order to quantify the alignment between internal representations derived from stimuli…

Neurons and Cognition · Quantitative Biology 2024-05-03 Alex Murphy, Joel Zylberberg, Alona Fyshe

Knowledge distillation (KD) has proven to be a highly effective approach for enhancing model performance through a teacher-student training scheme. However, most existing distillation methods are designed under the assumption that the…

Computer Vision and Pattern Recognition · Computer Science 2023-10-31 Zhiwei Hao, Jianyuan Guo, Kai Han, Yehui Tang, Han Hu, Yunhe Wang, Chang Xu

Knowledge amalgamation (KA) aims to learn a compact student model to handle the joint objective from multiple teacher models that are specialized for their own tasks respectively. Current methods focus on coarsely aligning teachers and…

Computer Vision and Pattern Recognition · Computer Science 2023-07-28 Shangde Gao, Yichao Fu, Ke Liu, Yuqiang Han

Cross-Tokenizer Knowledge Distillation (CTKD) enables knowledge transfer between a large language model and a smaller student, even when they employ different tokenizers. While existing approaches mainly focus on token-level alignment…

Computation and Language · Computer Science 2026-05-05 Quoc Phong Dao, Hoang Son Nguyen, Pham Khanh Chi, Tung Nguyen, Linh Ngo Van, Nguyen Thi Ngoc Diep, Trung Le

Knowledge distillation (KD) is a technique for transferring knowledge from complex teacher models to simpler student models, significantly enhancing model efficiency and accuracy. It has demonstrated substantial advancements in various…

Computation and Language · Computer Science 2025-04-21 Junjie Yang, Junhao Song, Xudong Han, Ziqian Bi, Tianyang Wang, Chia Xin Liang, Xinyuan Song, Yichao Zhang, Qian Niu, Benji Peng, Keyu Chen, Ming Liu

Knowledge Distillation (KD) has emerged as a pivotal technique for neural network compression and performance enhancement. Most KD methods aim to transfer dark knowledge from a cumbersome teacher model to a lightweight student model based…

Machine Learning · Computer Science 2024-10-10 Wenqi Niu, Yingchao Wang, Guohui Cai, Hanpo Hou

Knowledge distillation is an effective way for model compression in deep learning. Given a large model (i.e., teacher model), it aims to improve the performance of a compact model (i.e., student model) by transferring the information from…

Machine Learning · Computer Science 2022-03-31 Qi Qian, Hao Li, Juhua Hu

Knowledge Distillation (KD) aims to transfer knowledge from a large teacher model to a smaller student model. While contrastive learning has shown promise in self-supervised learning by creating discriminative representations, its…

Computer Vision and Pattern Recognition · Computer Science 2025-05-14 Nikolaos Giakoumoglou, Tania Stathaki

Knowledge distillation (KD) has become an important technique for model compression and knowledge transfer. In this work, we first perform a comprehensive analysis of the knowledge transferred by different KD methods. We demonstrate that…

Computer Vision and Pattern Recognition · Computer Science 2021-06-07 Fei Ding, Yin Yang, Hongxin Hu, Venkat Krovi, Feng Luo

Knowledge distillation is a mainstream algorithm in model compression by transferring knowledge from the larger model (teacher) to the smaller model (student) to improve the performance of the student. Despite many efforts, existing methods…

Computer Vision and Pattern Recognition · Computer Science 2024-10-21 Muhe Ding, Jianlong Wu, Xue Dong, Xiaojie Li, Pengda Qin, Tian Gan, Liqiang Nie

Large Vision-Language Models (VLMs) are successful in addressing a multitude of vision-language understanding tasks, such as Visual Question Answering (VQA), but their memory and compute requirements remain a concern for practical…

Computer Vision and Pattern Recognition · Computer Science 2026-05-12 Nikolaos Gkalelis, Vasileios Mezaris

In instance-level detection tasks (e.g., object detection), reducing input resolution is an easy option to improve runtime efficiency. However, this option traditionally hurts detection performance considerably. This paper focuses on boosting…

Computer Vision and Pattern Recognition · Computer Science 2021-09-16 Lu Qi, Jason Kuen, Jiuxiang Gu, Zhe Lin, Yi Wang, Yukang Chen, Yanwei Li, Jiaya Jia

Knowledge distillation is initially introduced to utilize additional supervision from a single teacher model for the student model training. To boost the student performance, some recent variants attempt to exploit diverse knowledge sources…

Machine Learning · Computer Science 2022-02-15 Hailin Zhang, Defang Chen, Can Wang

Large language models (LLMs) achieve state-of-the-art (SOTA) performance across language tasks, but are costly to deploy due to their size and resource demands. Knowledge Distillation (KD) addresses this by training smaller Student models…

Computation and Language · Computer Science 2026-05-19 Stella Eva Tsiapali, Cong-Thanh Do, Kate Knill

Recent advances in deep learning have led to rapid developments in the field of image retrieval. However, the best performing architectures incur significant computational cost. Recent approaches tackle this issue using knowledge…

Computer Vision and Pattern Recognition · Computer Science 2020-07-14 Zakaria Laskar, Juho Kannala