Instance-Conditional Knowledge Distillation for Object Detection

Computer Vision and Pattern Recognition 2022-01-07 v2

Abstract

Knowledge distillation has shown great success in classification; however, it remains challenging for detection. In a typical image for detection, representations from different locations may contribute differently to the detection targets, making the distillation hard to balance. In this paper, we propose a conditional distillation framework to distill the desired knowledge, namely knowledge that is beneficial in terms of both classification and localization for every instance. The framework introduces a learnable conditional decoding module, which retrieves information given each target instance as a query. Specifically, we encode the condition information as the query and use the teacher's representations as the key. The attention between query and key is used to measure the contribution of different features, guided by a localization-recognition-sensitive auxiliary task. Extensive experiments demonstrate the efficacy of our method: we observe impressive improvements under various settings. Notably, we boost RetinaNet with a ResNet-50 backbone from 37.4 to 40.7 mAP (+3.3) under the 1x schedule, which even surpasses the teacher (40.4 mAP) with a ResNet-101 backbone under the 3x schedule. Code has been released at https://github.com/megvii-research/ICD.
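
The query-key attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the released ICD implementation: the scaled dot-product form, the softmax over locations, the weighted mean-squared-error loss, and the names `instance_attention` and `weighted_distill_loss` are all assumptions made for illustration, and the learnable decoding module and auxiliary task are omitted.

```python
import numpy as np


def instance_attention(queries, teacher_feats):
    """Attention of each instance query over teacher feature locations.

    queries:       (N, D) array, one hypothetical encoded condition per instance.
    teacher_feats: (P, D) array, one teacher feature vector per location (the keys).
    Returns an (N, P) array whose rows sum to 1.
    Assumption: scaled dot-product attention with a softmax over locations.
    """
    d = queries.shape[-1]
    logits = queries @ teacher_feats.T / np.sqrt(d)
    # Subtract the row-wise max for numerical stability before exponentiating.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=-1, keepdims=True)


def weighted_distill_loss(student_feats, teacher_feats, weights):
    """Attention-weighted feature imitation loss (an assumed form).

    The per-location squared error between student and teacher features is
    weighted by each instance's attention, then averaged over instances.
    """
    per_location = ((student_feats - teacher_feats) ** 2).mean(axis=-1)  # (P,)
    per_instance = (weights * per_location[None, :]).sum(axis=-1)        # (N,)
    return float(per_instance.mean())


# Toy data: 3 instances, 50 feature locations, 16 channels.
rng = np.random.default_rng(0)
queries = rng.normal(size=(3, 16))
teacher = rng.normal(size=(50, 16))
student = teacher + 0.1 * rng.normal(size=(50, 16))

attn = instance_attention(queries, teacher)
loss = weighted_distill_loss(student, teacher, attn)
```

In this sketch, locations that receive more attention from an instance contribute more to that instance's distillation loss, which is one simple way to weight feature contributions per instance.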

Cite

@article{arxiv.2110.12724,
  title  = {Instance-Conditional Knowledge Distillation for Object Detection},
  author = {Zijian Kang and Peizhen Zhang and Xiangyu Zhang and Jian Sun and Nanning Zheng},
  journal= {arXiv preprint arXiv:2110.12724},
  year   = {2022}
}

Comments

Accepted by NeurIPS 2021
