Related papers: Multi-Granularity Language-Guided Training for Mul…

LaMOT: Language-Guided Multi-Object Tracking

Vision-Language MOT is a crucial tracking problem and has drawn increasing attention recently. It aims to track objects based on human language commands, replacing the traditional use of templates or pre-set information from training sets…

Computer Vision and Pattern Recognition · Computer Science 2024-06-13 Yunhao Li, Xiaoqiong Liu, Luke Liu, Heng Fan, Libo Zhang

IP-MOT: Instance Prompt Learning for Cross-Domain Multi-Object Tracking

Multi-Object Tracking (MOT) aims to associate multiple objects across video frames and is a challenging vision task due to inherent complexities in the tracking environment. Most existing approaches train and track within a single domain,…

Computer Vision and Pattern Recognition · Computer Science 2024-11-01 Run Luo, Zikai Song, Longze Chen, Yunshui Li, Min Yang, Wei Yang

TP-GMOT: Tracking Generic Multiple Object by Textual Prompt with Motion-Appearance Cost (MAC) SORT

While Multi-Object Tracking (MOT) has made substantial advancements, it is limited by heavy reliance on prior knowledge and limited to predefined categories. In contrast, Generic Multiple Object Tracking (GMOT), tracking multiple objects…

Computer Vision and Pattern Recognition · Computer Science 2024-09-05 Duy Le Dinh Anh, Kim Hoang Tran, Ngan Hoang Le

Awesome Multi-modal Object Tracking

Multi-modal object tracking (MMOT) is an emerging field that combines data from various modalities, e.g., vision (RGB), depth, thermal infrared, event, language and audio, to estimate the state of an arbitrary object in a video sequence. It…

Computer Vision and Pattern Recognition · Computer Science 2024-06-03 Chunhui Zhang, Li Liu, Hao Wen, Xi Zhou, Yanfeng Wang

Divert More Attention to Vision-Language Object Tracking

Multimodal vision-language (VL) learning has noticeably pushed the tendency toward generic intelligence owing to emerging large foundation models. However, tracking, as a fundamental vision problem, surprisingly enjoys less bonus from…

Computer Vision and Pattern Recognition · Computer Science 2023-07-20 Mingzhe Guo, Zhipeng Zhang, Liping Jing, Haibin Ling, Heng Fan

IA-MOT: Instance-Aware Multi-Object Tracking with Motion Consistency

Multiple object tracking (MOT) is a crucial task in the computer vision community. However, most tracking-by-detection MOT methods, with available detected bounding boxes, cannot effectively handle static, slow-moving and fast-moving camera…

Computer Vision and Pattern Recognition · Computer Science 2020-06-25 Jiarui Cai, Yizhou Wang, Haotian Zhang, Hung-Min Hsu, Chengqian Ma, Jenq-Neng Hwang

GroundingGPT: Language Enhanced Multi-modal Grounding Model

Multi-modal large language models have demonstrated impressive performance across various tasks in different modalities. However, existing multi-modal models primarily emphasize capturing global information within each modality while…

Computer Vision and Pattern Recognition · Computer Science 2024-03-06 Zhaowei Li, Qi Xu, Dong Zhang, Hang Song, Yiqing Cai, Qi Qi, Ran Zhou, Junting Pan, Zefeng Li, Van Tu Vu, Zhida Huang, Tao Wang

Robust Multi-Modality Multi-Object Tracking

Multi-sensor perception is crucial to ensure reliability and accuracy in autonomous driving systems, while multi-object tracking (MOT) improves that by tracing sequential movement of dynamic objects. Most current approaches for…

Computer Vision and Pattern Recognition · Computer Science 2019-09-10 Wenwei Zhang, Hui Zhou, Shuyang Sun, Zhe Wang, Jianping Shi, Chen Change Loy

Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking

Multi-Object Tracking (MOT) remains a vital component of intelligent video analysis, which aims to locate targets and maintain a consistent identity for each target throughout a video sequence. Existing works usually learn a discriminative…

Computer Vision and Pattern Recognition · Computer Science 2023-11-20 Yizhe Li, Sanping Zhou, Zheng Qin, Le Wang, Jinjun Wang, Nanning Zheng

LLMTrack: Semantic Multi-Object Tracking with Multi-modal Large Language Models

Multi-Object Tracking (MOT) is evolving from geometric localization to Semantic MOT (SMOT) to answer complex relational queries, yet progress is hindered by semantic data scarcity and a structural disconnect between tracking architectures…

Computer Vision and Pattern Recognition · Computer Science 2026-03-13 Pan Liao, Feng Yang, Di Wu, Jinwen Yu, Yuhua Zhu, Wenhui Zhao, Dingwen Zhang

VSE-MOT: Multi-Object Tracking in Low-Quality Video Scenes Guided by Visual Semantic Enhancement

Current multi-object tracking (MOT) algorithms typically overlook issues inherent in low-quality videos, leading to significant degradation in tracking performance when confronted with real-world image deterioration. Therefore, advancing…

Computer Vision and Pattern Recognition · Computer Science 2025-09-18 Jun Du, Weiwei Xing, Ming Li, Fei Richard Yu

MOT FCG++: Enhanced Representation of Spatio-temporal Motion and Appearance Features

The goal of multi-object tracking (MOT) is to detect and track all objects in a scene across frames, while maintaining a unique identity for each object. Most existing methods rely on the spatial-temporal motion features and appearance…

Computer Vision and Pattern Recognition · Computer Science 2024-11-22 Yanzhao Fang

Tell Me What to Track: Infusing Robust Language Guidance for Enhanced Referring Multi-Object Tracking

Referring multi-object tracking (RMOT) is an emerging cross-modal task that aims to localize an arbitrary number of targets based on a language expression and continuously track them in a video. This intricate task involves reasoning on…

Computer Vision and Pattern Recognition · Computer Science 2025-07-28 Wenjun Huang, Yang Ni, Hanning Chen, Yirui He, Ian Bryant, Yezi Liu, Mohsen Imani

Enhanced Kalman with Adaptive Appearance Motion SORT for Grounded Generic Multiple Object Tracking

Despite recent progress, Multi-Object Tracking (MOT) continues to face significant challenges, particularly its dependence on prior knowledge and predefined categories, complicating the tracking of unfamiliar objects. Generic Multiple…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Duy Le Dinh Anh, Kim Hoang Tran, Quang-Thuc Nguyen, Ngan Hoang Le

Deep OC-SORT: Multi-Pedestrian Tracking by Adaptive Re-Identification

Motion-based association for Multi-Object Tracking (MOT) has recently re-achieved prominence with the rise of powerful object detectors. Despite this, little work has been done to incorporate appearance cues beyond simple heuristic models…

Computer Vision and Pattern Recognition · Computer Science 2023-02-24 Gerard Maggiolino, Adnan Ahmad, Jinkun Cao, Kris Kitani

IMM-MOT: A Novel 3D Multi-object Tracking Framework with Interacting Multiple Model Filter

3D Multi-Object Tracking (MOT) provides the trajectories of surrounding objects, assisting robots or vehicles in smarter path planning and obstacle avoidance. Existing 3D MOT methods based on the Tracking-by-Detection framework typically…

Computer Vision and Pattern Recognition · Computer Science 2025-02-17 Xiaohong Liu, Xulong Zhao, Gang Liu, Zili Wu, Tao Wang, Lei Meng, Yuhan Wang

OVTrack: Open-Vocabulary Multiple Object Tracking

The ability to recognize, localize and track dynamic objects in a scene is fundamental to many real-world applications, such as self-driving and robotic systems. Yet, traditional multiple object tracking (MOT) benchmarks rely only on a few…

Computer Vision and Pattern Recognition · Computer Science 2023-04-18 Siyuan Li, Tobias Fischer, Lei Ke, Henghui Ding, Martin Danelljan, Fisher Yu

Vision-Motion-Reference Alignment for Referring Multi-Object Tracking via Multi-Modal Large Language Models

Referring Multi-Object Tracking (RMOT) extends conventional multi-object tracking (MOT) by introducing natural language references for multi-modal fusion tracking. RMOT benchmarks only describe the object's appearance, relative positions,…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Weiyi Lv, Ning Zhang, Hanyang Sun, Haoran Jiang, Kai Zhao, Jing Xiao, Dan Zeng

UnsMOT: Unified Framework for Unsupervised Multi-Object Tracking with Geometric Topology Guidance

Object detection has long been a topic of high interest in computer vision literature. Motivated by the fact that annotating data for the multi-object tracking (MOT) problem is immensely expensive, recent studies have turned their attention…

Computer Vision and Pattern Recognition · Computer Science 2023-09-06 Son Tran, Cong Tran, Anh Tran, Cuong Pham

ORMOT: A Dataset and Framework for Omnidirectional Referring Multi-Object Tracking

Multi-Object Tracking (MOT) is a fundamental task in computer vision, aiming to track targets across video frames. Existing MOT methods perform well in general visual scenes, but face significant challenges and limitations when extended to…

Computer Vision and Pattern Recognition · Computer Science 2026-03-06 Sijia Chen, Zihan Zhou, Yanqiu Yu, En Yu, Wenbing Tao