Object-Centric Multiple Object Tracking

Zixu Zhao; Jiaze Wang; Max Horn; Yizhuo Ding; Tong He; Zechen Bai; Dominik Zietlow; Carl-Johann Simon-Gabriel; Bing Shuai; Zhuowen Tu; Thomas Brox; Bernt Schiele; Yanwei Fu; Francesco Locatello; Zheng Zhang; Tianjun Xiao

Computer Vision and Pattern Recognition 2023-09-06 v2

Abstract

Unsupervised object-centric learning methods allow the partitioning of scenes into entities without additional localization information and are excellent candidates for reducing the annotation burden of multiple-object tracking (MOT) pipelines. Unfortunately, they fall short in two key respects: objects are often split into parts and are not consistently tracked over time. In fact, state-of-the-art models achieve pixel-level accuracy and temporal consistency by relying on supervised object detection with additional ID labels for the association through time. This paper proposes a video object-centric model for MOT. It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module that builds complete object prototypes to handle occlusions. Benefiting from object-centric learning, we require only sparse detection labels (0%-6.25%) for object localization and feature binding. Relying on our self-supervised Expectation-Maximization-inspired loss for object association, our approach requires no ID labels. In our experiments, the approach significantly narrows the gap between the existing object-centric model and the fully supervised state of the art, and it outperforms several unsupervised trackers.

Keywords

multi-object tracking, object detection, video segmentation

Cite

@article{arxiv.2309.00233,
  title  = {Object-Centric Multiple Object Tracking},
  author = {Zixu Zhao and Jiaze Wang and Max Horn and Yizhuo Ding and Tong He and Zechen Bai and Dominik Zietlow and Carl-Johann Simon-Gabriel and Bing Shuai and Zhuowen Tu and Thomas Brox and Bernt Schiele and Yanwei Fu and Francesco Locatello and Zheng Zhang and Tianjun Xiao},
  journal= {arXiv preprint arXiv:2309.00233},
  year   = {2023}
}

Comments

ICCV 2023 camera-ready version
