Improving Object Detection with Selective Self-supervised Self-training

Computer Vision and Pattern Recognition 2020-07-28 v2

Abstract

We study how to leverage Web images to augment human-curated object detection datasets. Our approach is two-pronged. On the one hand, we retrieve Web images by image-to-image search, which incurs less domain shift from the curated data than other search methods. The Web images are diverse, supplying a wide variety of object poses, appearances, and interactions with context. On the other hand, we propose a novel learning method motivated by two parallel lines of work that explore unlabeled data for image classification: self-training and self-supervised learning. In their vanilla forms, both fail to improve object detectors because of the domain gap between the Web images and curated datasets. To tackle this challenge, we propose a selective net to rectify the supervision signals in Web images. It not only identifies positive bounding boxes but also creates a safe zone for mining hard negative boxes. We report state-of-the-art results on detecting backpacks and chairs from everyday scenes, along with other challenging object classes.

Cite

@article{arxiv.2007.09162,
  title  = {Improving Object Detection with Selective Self-supervised Self-training},
  author = {Yandong Li and Di Huang and Danfeng Qin and Liqiang Wang and Boqing Gong},
  journal= {arXiv preprint arXiv:2007.09162},
  year   = {2020}
}

Comments

Accepted to ECCV 2020
