Visual Recognition by Request

Chufeng Tang; Lingxi Xie; Xiaopeng Zhang; Xiaolin Hu; Qi Tian

Computer Vision and Pattern Recognition 2022-12-13 v2

Abstract

Humans can recognize visual semantics at unlimited granularity, but existing visual recognition algorithms cannot. In this paper, we establish a new paradigm, visual recognition by request (ViRReq), to bridge this gap. The key lies in decomposing visual recognition into atomic tasks called requests and leveraging a knowledge base, a hierarchical and text-based dictionary, to assist task definition. ViRReq allows for (i) learning complicated whole-part hierarchies from highly incomplete annotations and (ii) inserting new concepts with minimal effort. We also establish a solid baseline by integrating language-driven recognition into recent semantic and instance segmentation methods, and demonstrate its flexible recognition ability on CPP and ADE20K, two datasets with hierarchical whole-part annotations.

Keywords

video retrieval; image retrieval; image representation learning

Cite

@article{arxiv.2207.14227,
  title  = {Visual Recognition by Request},
  author = {Chufeng Tang and Lingxi Xie and Xiaopeng Zhang and Xiaolin Hu and Qi Tian},
  journal= {arXiv preprint arXiv:2207.14227},
  year   = {2022}
}
