DeepBox: Learning Objectness with Convolutional Networks

Computer Vision and Pattern Recognition 2015-09-29 v2

Abstract

Existing object proposal approaches rank proposals primarily with bottom-up cues, while we believe that objectness is in fact a high-level construct. We argue for a data-driven, semantic approach to ranking object proposals. Our framework, which we call DeepBox, uses convolutional neural networks (CNNs) to rerank proposals from a bottom-up method. We use a novel four-layer CNN architecture that is as good as much larger networks at evaluating objectness while being much faster. We show that DeepBox significantly improves over the bottom-up ranking, achieving the same recall with 500 proposals as bottom-up methods achieve with 2000. This improvement generalizes to categories the CNN has never seen before and leads to a 4.5-point gain in detection mAP. Our implementation achieves this performance while running at 260 ms per image.
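
The reranking idea in the abstract can be illustrated with a minimal sketch. This is not the DeepBox implementation: `score_fn` is a hypothetical stand-in for any learned objectness scorer (the paper uses a CNN), and the helper names `rerank_proposals`, `iou`, and `recall_at_k`, as well as the 0.5 IoU threshold, are assumptions made for illustration. The sketch only shows how proposals from a bottom-up method might be reordered by score, truncated to a budget, and evaluated by recall.

```python
from typing import Callable, List, Tuple

# A box is (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def rerank_proposals(boxes: List[Box],
                     score_fn: Callable[[Box], float],
                     top_k: int) -> List[Box]:
    """Sort proposals by descending score and keep the best top_k.

    score_fn is a placeholder for a learned objectness model.
    """
    scored = [(score_fn(b), b) for b in boxes]
    scored.sort(key=lambda sb: sb[0], reverse=True)
    return [b for _, b in scored[:top_k]]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_k(proposals: List[Box],
                ground_truth: List[Box],
                iou_threshold: float = 0.5) -> float:
    """Fraction of ground-truth boxes covered by at least one proposal."""
    if not ground_truth:
        return 0.0
    hits = sum(
        1 for gt in ground_truth
        if any(iou(p, gt) >= iou_threshold for p in proposals)
    )
    return hits / len(ground_truth)


def toy_score(box: Box) -> float:
    """Toy scorer (box area), used only to exercise the code."""
    return (box[2] - box[0]) * (box[3] - box[1])


boxes = [(0, 0, 10, 10), (0, 0, 2, 2), (5, 5, 20, 20)]
top2 = rerank_proposals(boxes, toy_score, top_k=2)
```

With a better scorer, fewer proposals are needed to reach a given recall, which is the effect the abstract reports (500 reranked proposals matching 2000 bottom-up proposals).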

Cite

@article{arxiv.1505.02146,
  title  = {DeepBox: Learning Objectness with Convolutional Networks},
  author = {Weicheng Kuo and Bharath Hariharan and Jitendra Malik},
  journal= {arXiv preprint arXiv:1505.02146},
  year   = {2015}
}

Comments

ICCV 2015 Camera-ready version
