Related papers: ReMOTS: Self-Supervised Refining Multi-Object Trac…

MOTS: Multi-Object Tracking and Segmentation

This paper extends the popular task of multi-object tracking to multi-object tracking and segmentation (MOTS). Towards this goal, we create dense pixel-level annotations for two existing tracking datasets using a semi-automatic annotation…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Paul Voigtlaender, Michael Krause, Aljosa Osep, Jonathon Luiten, Berin Balachandar Gnana Sekar, Andreas Geiger, Bastian Leibe

Learning Multi-Object Tracking and Segmentation from Automatic Annotations

In this work we contribute a novel pipeline to automatically generate training data, and to improve over state-of-the-art multi-object tracking and segmentation (MOTS) methods. Our proposed track mining algorithm turns raw street-level…

Computer Vision and Pattern Recognition · Computer Science 2020-03-31 Lorenzo Porzi, Markus Hofinger, Idoia Ruiz, Joan Serrat, Samuel Rota Bulò, Peter Kontschieder

Multi-object tracking with self-supervised associating network

Multi-Object Tracking (MOT) is the task that has a lot of potential for development, and there are still many problems to be solved. In the traditional tracking by detection paradigm, there has been a lot of work on feature based object…

Computer Vision and Pattern Recognition · Computer Science 2020-10-27 Tae-young Chung, Heansung Lee, Myeong Ah Cho, Suhwan Cho, Sangyoun Lee

Improving tracking with a tracklet associator

Multiple object tracking (MOT) is a task in computer vision that aims to detect the position of various objects in videos and to associate them to a unique identity. We propose an approach based on Constraint Programming (CP) whose goal is…

Computer Vision and Pattern Recognition · Computer Science 2022-04-25 Rémi Nahon, Guillaume-Alexandre Bilodeau, Gilles Pesant

MeNToS: Tracklets Association with a Space-Time Memory Network

We propose a method for multi-object tracking and segmentation (MOTS) that does not require fine-tuning or per benchmark hyperparameter selection. The proposed method addresses particularly the data association problem. Indeed, the recently…

Computer Vision and Pattern Recognition · Computer Science 2021-07-16 Mehdi Miah, Guillaume-Alexandre Bilodeau, Nicolas Saunier

Appearance-Based Refinement for Object-Centric Motion Segmentation

The goal of this paper is to discover, segment, and track independently moving objects in complex visual scenes. Previous approaches have explored the use of optical flow for motion segmentation, leading to imperfect predictions due to…

Computer Vision and Pattern Recognition · Computer Science 2024-08-20 Junyu Xie, Weidi Xie, Andrew Zisserman

Score refinement for confidence-based 3D multi-object tracking

Multi-object tracking is a critical component in autonomous navigation, as it provides valuable information for decision-making. Many researchers tackled the 3D multi-object tracking task by filtering out the frame-by-frame 3D detections;…

Computer Vision and Pattern Recognition · Computer Science 2021-07-12 Nuri Benbarka, Jona Schröder, Andreas Zell

Self-supervised Video Object Segmentation

The objective of this paper is self-supervised representation learning, with the goal of solving semi-supervised video object segmentation (a.k.a. dense tracking). We make the following contributions: (i) we propose to improve the existing…

Computer Vision and Pattern Recognition · Computer Science 2020-06-23 Fangrui Zhu, Li Zhang, Yanwei Fu, Guodong Guo, Weidi Xie

ReMoT: Reinforcement Learning with Motion Contrast Triplets

We present ReMoT, a unified training paradigm to systematically address the fundamental shortcomings of VLMs in spatio-temporal consistency -- a critical failure point in navigation, robotics, and autonomous driving. ReMoT integrates two…

Computer Vision and Pattern Recognition · Computer Science 2026-03-23 Cong Wan, Zeyu Guo, Jiangyang Li, SongLin Dong, Yifan Bai, Lin Peng, Zhiheng Ma, Yihong Gong

MOTS: Multiple Object Tracking for General Categories Based On Few-Shot Method

Most modern Multi-Object Tracking (MOT) systems typically apply a REID-based paradigm to hold a balance between computational efficiency and performance. In the past few years, numerous attempts have been made to perfect the systems. Although…

Computer Vision and Pattern Recognition · Computer Science 2020-05-20 Xixi Xu, Chao Lu, Liang Zhu, Xiangyang Xue, Guanxian Chen, Qi Guo, Yining Lin, Zhijian Zhao

UnsMOT: Unified Framework for Unsupervised Multi-Object Tracking with Geometric Topology Guidance

Object detection has long been a topic of high interest in computer vision literature. Motivated by the fact that annotating data for the multi-object tracking (MOT) problem is immensely expensive, recent studies have turned their attention…

Computer Vision and Pattern Recognition · Computer Science 2023-09-06 Son Tran, Cong Tran, Anh Tran, Cuong Pham

PReMVOS: Proposal-generation, Refinement and Merging for Video Object Segmentation

We address semi-supervised video object segmentation, the task of automatically generating accurate and consistent pixel masks for objects in a video sequence, given the first-frame ground truth annotations. Towards this goal, we present…

Computer Vision and Pattern Recognition · Computer Science 2018-11-06 Jonathon Luiten, Paul Voigtlaender, Bastian Leibe

SearchTrack: Multiple Object Tracking with Object-Customized Search and Motion-Aware Features

The paper presents a new method, SearchTrack, for multiple object tracking and segmentation (MOTS). To address the association problem between detected objects, SearchTrack proposes object-customized search and motion-aware features. By…

Computer Vision and Pattern Recognition · Computer Science 2022-11-01 Zhong-Min Tsai, Yu-Ju Tsai, Chien-Yao Wang, Hong-Yuan Liao, Youn-Long Lin, Yung-Yu Chuang

Temporal-Enhanced Multimodal Transformer for Referring Multi-Object Tracking and Segmentation

Referring multi-object tracking (RMOT) is an emerging cross-modal task that aims to locate an arbitrary number of target objects and maintain their identities referred by a language expression in a video. This intricate task involves the…

Computer Vision and Pattern Recognition · Computer Science 2024-10-18 Changcheng Xiao, Qiong Cao, Yujie Zhong, Xiang Zhang, Tao Wang, Canqun Yang, Long Lan

PointTrack++ for Effective Online Multi-Object Tracking and Segmentation

Multiple-object tracking and segmentation (MOTS) is a novel computer vision task that aims to jointly perform multiple object tracking (MOT) and instance segmentation. In this work, we present PointTrack++, an effective on-line framework…

Computer Vision and Pattern Recognition · Computer Science 2020-07-06 Zhenbo Xu, Wei Zhang, Xiao Tan, Wei Yang, Xiangbo Su, Yuchen Yuan, Hongwu Zhang, Shilei Wen, Errui Ding, Liusheng Huang

AttMOT: Improving Multiple-Object Tracking by Introducing Auxiliary Pedestrian Attributes

Multi-object tracking (MOT) is a fundamental problem in computer vision with numerous applications, such as intelligent surveillance and automated driving. Despite the significant progress made in MOT, pedestrian attributes, such as gender,…

Computer Vision and Pattern Recognition · Computer Science 2023-08-16 Yunhao Li, Zhen Xiao, Lin Yang, Dan Meng, Xin Zhou, Heng Fan, Libo Zhang

ReaMOT: A Benchmark and Framework for Reasoning-based Multi-Object Tracking

Referring Multi-Object Tracking (RMOT) aims to track targets specified by language instructions. However, existing RMOT paradigms heavily rely on explicit visual-textual matching and consequently fail to generalize to complex instructions…

Computer Vision and Pattern Recognition · Computer Science 2026-05-12 Sijia Chen, Yanqiu Yu, En Yu, Wenbing Tao

Weakly-Supervised Referring Video Object Segmentation through Text Supervision

Referring video object segmentation (RVOS) aims to segment the target instance in a video, referred by a text expression. Conventional approaches are mostly supervised learning, requiring expensive pixel-level mask annotations. To tackle…

Computer Vision and Pattern Recognition · Computer Science 2026-04-22 Miaojing Shi, Jun Huang, Zijie Yue, Hanli Wang

TR-MOT: Multi-Object Tracking by Reference

Multi-object Tracking (MOT) generally can be split into two sub-tasks, i.e., detection and association. Many previous methods follow the tracking by detection paradigm, which first obtain detections at each frame and then associate them…

Computer Vision and Pattern Recognition · Computer Science 2022-04-01 Mingfei Chen, Yue Liao, Si Liu, Fei Wang, Jenq-Neng Hwang

Learning Referring Video Object Segmentation from Weak Annotation

Referring video object segmentation (RVOS) is a task that aims to segment the target object in all video frames based on a sentence describing the object. Although existing RVOS methods have achieved significant performance, they depend on…

Computer Vision and Pattern Recognition · Computer Science 2023-12-18 Wangbo Zhao, Kepan Nan, Songyang Zhang, Kai Chen, Dahua Lin, Yang You