Modulating Localization and Classification for Harmonized Object Detection

Computer Vision and Pattern Recognition 2021-03-26 v2

Abstract

Object detection involves two sub-tasks: localizing objects in an image and classifying them into categories. In existing CNN-based detectors, we observe a widespread divergence between localization and classification, which degrades performance. In this work, we propose a mutual learning framework to modulate the two tasks. In particular, the two tasks are forced to learn from each other through a novel mutual labeling strategy. In addition, we introduce a simple yet effective IoU rescoring scheme that further reduces the divergence. We also define a Spearman rank correlation-based metric to quantify the divergence, and this metric correlates well with detection performance. The proposed approach is general-purpose and can be easily integrated into existing detectors such as FCOS and RetinaNet. We achieve a significant performance gain over the baseline detectors on the COCO dataset.
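The abstract does not give the exact definition of the divergence metric, so the following is only a minimal sketch of the general idea. It assumes each detection carries a classification score and a localization IoU with its matched ground-truth box, and computes the plain Spearman rank correlation between the two lists. The function names `average_ranks` and `spearman` are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(scores: Sequence[float], ious: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the two rank lists.

    Assumption: scores[i] and ious[i] belong to the same detection.
    """
    if len(scores) != len(ious) or len(scores) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(scores), average_ranks(ious)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Illustrative values only, not data from the paper.
harmonized = spearman([0.9, 0.8, 0.3], [0.85, 0.70, 0.40])
divergent = spearman([0.9, 0.8, 0.3], [0.40, 0.70, 0.85])
```

Under this reading, a correlation near 1 means confidently classified detections are also well localized, while a low or negative value indicates the divergence the abstract describes.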

Cite

@article{arxiv.2103.08958,
  title  = {Modulating Localization and Classification for Harmonized Object Detection},
  author = {Taiheng Zhang and Qiaoyong Zhong and Shiliang Pu and Di Xie},
  journal = {arXiv preprint arXiv:2103.08958},
  year   = {2021}
}

Comments

Accepted by ICME 2021
