SOIT: Segmenting Objects with Instance-Aware Transformers

Xiaodong Yu; Dahu Shi; Xing Wei; Ye Ren; Tingqun Ye; Wenming Tan

Computer Vision and Pattern Recognition 2021-12-24 v2

Abstract

This paper presents an end-to-end instance segmentation framework, termed SOIT, that Segments Objects with Instance-aware Transformers. Inspired by DETR (Carion et al., 2020), our method treats instance segmentation as a direct set prediction problem and removes the need for many hand-crafted components such as RoI cropping, one-to-many label assignment, and non-maximum suppression (NMS). In SOIT, multiple queries are learned to directly infer, in parallel and under the global image context, a set of object embeddings that encode semantic category, bounding-box location, and pixel-wise mask. The class and bounding box are easily embedded in a fixed-length vector. The pixel-wise mask, in contrast, is embedded as a group of parameters that construct a lightweight instance-aware transformer. This instance-aware transformer then produces a full-resolution mask without any RoI-based operation. Overall, SOIT is a simple single-stage instance segmentation framework that is both RoI-free and NMS-free. Experimental results on the MS COCO dataset demonstrate that SOIT significantly outperforms state-of-the-art instance segmentation approaches. Moreover, jointly learning multiple tasks in a unified query embedding also substantially improves detection performance. Code is available at https://github.com/yuxiaodongHRI/SOIT.
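
To illustrate what a one-to-one assignment in set prediction looks like, the sketch below matches each ground-truth object to exactly one query by minimizing a total cost. It is not the authors' implementation: the function name match_one_to_one, the cost values, and the brute-force search are all assumptions made for illustration (DETR itself uses Hungarian matching, and the abstract does not specify SOIT's matching cost).

```python
from itertools import permutations


def match_one_to_one(cost):
    """Return, for each ground-truth object, the index of its matched query.

    cost[q][g] is a hypothetical matching cost between query q and
    ground-truth object g. Assumes there are at least as many queries
    as objects.
    """
    num_queries = len(cost)
    num_objects = len(cost[0]) if cost else 0
    best_total, best_assignment = float("inf"), ()
    # Brute force over every way of giving each object a distinct query.
    # This is only practical for tiny examples.
    for queries in permutations(range(num_queries), num_objects):
        total = sum(cost[q][g] for g, q in enumerate(queries))
        if total < best_total:
            best_total, best_assignment = total, queries
    return list(best_assignment)


# Three queries, two ground-truth objects (illustrative costs only).
cost = [
    [0.9, 0.2],
    [0.1, 0.8],
    [0.5, 0.6],
]
assignment = match_one_to_one(cost)  # query index per object
# Queries left without an object would be supervised as "no object".
unmatched = [q for q in range(len(cost)) if q not in assignment]
print(assignment, unmatched)
```

Because every object is claimed by a single query, duplicate predictions are discouraged during training, which is why a set-prediction model of this kind can do without NMS at inference time.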

Keywords

video segmentation, image segmentation, object detection

Cite

@article{arxiv.2112.11037,
  title  = {SOIT: Segmenting Objects with Instance-Aware Transformers},
  author = {Xiaodong Yu and Dahu Shi and Xing Wei and Ye Ren and Tingqun Ye and Wenming Tan},
  journal= {arXiv preprint arXiv:2112.11037},
  year   = {2021}
}

Comments

AAAI 2022
