Contrastive Learning for Multi-Object Tracking with Transformers

Pierre-François De Plaen; Nicola Marinello; Marc Proesmans; Tinne Tuytelaars; Luc Van Gool

doi:10.1109/WACV57701.2024.00672

Computer Vision and Pattern Recognition 2025-05-16 v1

Abstract

The DEtection TRansformer (DETR) opened new possibilities for object detection by modeling it as a translation task: converting image features into object-level representations. Previous works typically add expensive modules to DETR to perform Multi-Object Tracking (MOT), resulting in more complicated architectures. We instead show how DETR can be turned into a MOT model by employing an instance-level contrastive loss, a revised sampling strategy and a lightweight assignment method. Our training scheme learns object appearances while preserving detection capabilities and with little overhead. Its performance surpasses the previous state-of-the-art by +2.6 mMOTA on the challenging BDD100K dataset and is comparable to existing transformer-based methods on the MOT17 dataset.
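The abstract names an instance-level contrastive loss but does not spell it out. As a rough illustration only, the sketch below shows a generic supervised contrastive loss over object embeddings, where embeddings sharing a track identity are treated as positives. The function name `supervised_contrastive_loss`, the `temperature` value, and the toy inputs are assumptions made for this example; they are not taken from the paper, and the authors' exact formulation and sampling strategy may differ.

```python
import math


def supervised_contrastive_loss(embeddings, ids, temperature=0.1):
    """Hypothetical helper (not the paper's code): mean supervised
    contrastive loss over all anchors that have at least one positive."""
    # L2-normalise each embedding so dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    losses = []
    n = len(normed)
    for i in range(n):
        # Temperature-scaled similarity of anchor i to every other embedding.
        logits = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Positives are the other embeddings with the same identity label.
        positives = [j for j in logits if ids[j] == ids[i]]
        if not positives:
            continue
        # Numerically stable log-sum-exp over all non-anchor logits.
        peak = max(logits.values())
        log_denom = peak + math.log(
            sum(math.exp(v - peak) for v in logits.values())
        )
        # Average negative log-probability assigned to the positives.
        losses.append(
            -sum(logits[j] - log_denom for j in positives) / len(positives)
        )
    return sum(losses) / len(losses) if losses else 0.0


# Toy input: two objects (identities 0 and 1) observed in two frames.
toy_ids = [0, 1, 0, 1]
consistent = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
swapped = [[1.0, 0.0], [0.0, 1.0], [0.1, 0.9], [0.9, 0.1]]
loss_consistent = supervised_contrastive_loss(consistent, toy_ids)
loss_swapped = supervised_contrastive_loss(swapped, toy_ids)
```

In this toy setting the loss is lower when embeddings of the same identity point in similar directions across frames (`consistent`) than when they are exchanged (`swapped`), which is the property a tracker would rely on to associate detections by appearance.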

Keywords

multi-object tracking; object detection; domain adaptive object detection

Cite

@article{arxiv.2311.08043,
  title  = {Contrastive Learning for Multi-Object Tracking with Transformers},
  author = {Pierre-François De Plaen and Nicola Marinello and Marc Proesmans and Tinne Tuytelaars and Luc Van Gool},
  journal= {arXiv preprint arXiv:2311.08043},
  year   = {2023}
}

Comments

WACV 2024
