Complementary Relation Contrastive Distillation

Computer Vision and Pattern Recognition 2021-03-31 v1

Abstract

Knowledge distillation aims to transfer representation ability from a teacher model to a student model. Previous approaches focus on either individual representation distillation or inter-sample similarity preservation. We argue, however, that the inter-sample relation conveys abundant information and needs to be distilled more effectively. In this paper, we propose a novel knowledge distillation method, Complementary Relation Contrastive Distillation (CRCD), to transfer structural knowledge from the teacher to the student. Specifically, we estimate the mutual relation in an anchor-based way and distill the anchor-student relation under the supervision of its corresponding anchor-teacher relation. To make the estimate more robust, mutual relations are modeled by two complementary elements: the feature and its gradient. Furthermore, a lower bound on the mutual information between the anchor-teacher relation distribution and the anchor-student relation distribution is maximized via a relation contrastive loss, which distills both the sample representation and the inter-sample relations. Experiments on different benchmarks demonstrate the effectiveness of the proposed CRCD.
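The anchor-based relation and the contrastive objective described above can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes that teacher and student features already live in a shared D-dimensional space, that a relation is simply the normalized difference between a sample feature and an anchor feature, and that the contrastive loss takes the common InfoNCE form. The gradient-based relation mentioned in the abstract is omitted. The names `anchor_relations`, `relation_contrastive_loss`, and `temperature` are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def anchor_relations(features, anchors):
    """Relation of every sample to every anchor.

    Assumption: the relation is the normalized feature difference.
    features: (N, D), anchors: (A, D) -> relations: (N, A, D)
    """
    diff = features[:, None, :] - anchors[None, :, :]
    return l2_normalize(diff)


def relation_contrastive_loss(rel_teacher, rel_student, temperature=0.1):
    """InfoNCE-style loss over (sample, anchor) relation pairs.

    The student relation for pair (i, j) is the positive for the teacher
    relation of the same pair; all other teacher relations are negatives.
    """
    t = rel_teacher.reshape(-1, rel_teacher.shape[-1])  # (N*A, D)
    s = rel_student.reshape(-1, rel_student.shape[-1])  # (N*A, D)
    logits = s @ t.T / temperature                      # (N*A, N*A)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


# Toy usage with random features (illustrative only).
rng = np.random.default_rng(0)
anchors = rng.normal(size=(3, 8))           # anchor features
teacher_feats = rng.normal(size=(6, 8))     # teacher features of a batch
student_feats = teacher_feats + 0.1 * rng.normal(size=(6, 8))

rel_t = anchor_relations(teacher_feats, anchors)
rel_s = anchor_relations(student_feats, anchors)
loss = relation_contrastive_loss(rel_t, rel_s)
```

Minimizing a loss of this form pulls each anchor-student relation toward its matching anchor-teacher relation and pushes it away from the relations of other pairs, which is the general mechanism by which contrastive objectives tighten a lower bound on mutual information.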

Cite

@article{arxiv.2103.16367,
  title  = {Complementary Relation Contrastive Distillation},
  author = {Jinguo Zhu and Shixiang Tang and Dapeng Chen and Shijie Yu and Yakun Liu and Aijun Yang and Mingzhe Rong and Xiaohua Wang},
  journal= {arXiv preprint arXiv:2103.16367},
  year   = {2021}
}

Comments

CVPR2021 Poster
