Related papers: Transformer Meets Tracker: Exploiting Temporal Con…

Learning Spatio-Temporal Transformer for Visual Tracking

In this paper, we present a new tracking architecture with an encoder-decoder transformer as the key component. The encoder models the global spatio-temporal feature dependencies between target objects and search regions, while the decoder…

Computer Vision and Pattern Recognition · Computer Science 2021-04-01 Bin Yan, Houwen Peng, Jianlong Fu, Dong Wang, Huchuan Lu

A Deep Dive into Generic Object Tracking: A Survey

Generic object tracking remains an important yet challenging task in computer vision due to complex spatio-temporal dynamics, especially in the presence of occlusions, similar distractors, and appearance variations. Over the past two…

Computer Vision and Pattern Recognition · Computer Science 2025-08-01 Fereshteh Aghaee Meibodi, Shadi Alijani, Homayoun Najjaran

TrTr: Visual Tracking with Transformer

Template-based discriminative trackers are currently the dominant tracking methods due to their robustness and accuracy, and the Siamese-network-based methods that depend on cross-correlation operation between features extracted from…

Computer Vision and Pattern Recognition · Computer Science 2021-05-11 Moju Zhao, Kei Okada, Masayuki Inaba

Improving Siamese Based Trackers with Light or No Training through Multiple Templates and Temporal Network

High computational power and significant time are usually needed to train a deep learning based tracker on large datasets. Depending on many factors, training might not always be an option. In this paper, we propose a framework with two…

Computer Vision and Pattern Recognition · Computer Science 2024-10-16 Ali Sekhavati, Won-Sook Lee

TrackFormer: Multi-Object Tracking with Transformers

The challenging task of multi-object tracking (MOT) requires simultaneous reasoning about track initialization, identity, and spatio-temporal trajectories. We formulate this task as a frame-to-frame set prediction problem and introduce…

Computer Vision and Pattern Recognition · Computer Science 2022-05-02 Tim Meinhardt, Alexander Kirillov, Laura Leal-Taixe, Christoph Feichtenhofer

Visual Tracking by TridentAlign and Context Embedding

Recent advances in Siamese network-based visual tracking methods have enabled high performance on numerous tracking benchmarks. However, extensive scale variations of the target object and distractor objects with similar categories have…

Computer Vision and Pattern Recognition · Computer Science 2020-07-15 Janghoon Choi, Junseok Kwon, Kyoung Mu Lee

eMoE-Tracker: Environmental MoE-based Transformer for Robust Event-guided Object Tracking

The unique complementarity of frame-based and event cameras for high frame rate object tracking has recently inspired some research attempts to develop multi-modal fusion approaches. However, these methods directly fuse both modalities and…

Computer Vision and Pattern Recognition · Computer Science 2024-11-05 Yucheng Chen, Lin Wang

ProContEXT: Exploring Progressive Context Transformer for Tracking

Existing Visual Object Tracking (VOT) only takes the target area in the first frame as a template. This causes tracking to inevitably fail in fast-changing and crowded scenes, as it cannot account for changes in object appearance between…

Computer Vision and Pattern Recognition · Computer Science 2023-03-31 Jin-Peng Lan, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Bin Luo, Xu Bao, Wangmeng Xiang, Yifeng Geng, Xuansong Xie

Track Targets by Dense Spatio-Temporal Position Encoding

In this work, we propose a novel paradigm to encode the position of targets for target tracking in videos using transformers. The proposed paradigm, Dense Spatio-Temporal (DST) position encoding, encodes spatio-temporal position information…

Computer Vision and Pattern Recognition · Computer Science 2022-10-19 Jinkun Cao, Hao Wu, Kris Kitani

Towards a Better Match in Siamese Network Based Visual Object Tracker

Recently, Siamese network based trackers have received tremendous interest for their fast tracking speed and high performance. Despite the great success, this tracking framework still suffers from several limitations. First, it cannot…

Computer Vision and Pattern Recognition · Computer Science 2018-09-06 Anfeng He, Chong Luo, Xinmei Tian, Wenjun Zeng

Local Perception-Aware Transformer for Aerial Tracking

Transformer-based visual object tracking has been utilized extensively. However, the Transformer structure lacks sufficient inductive bias. In addition, only focusing on encoding the global feature does harm to modeling local details,…

Computer Vision and Pattern Recognition · Computer Science 2022-08-09 Changhong Fu, Weiyu Peng, Sihang Li, Junjie Ye, Ziang Cao

Deep Meta Learning for Real-Time Target-Aware Visual Tracking

In this paper, we propose a novel on-line visual tracking framework based on the Siamese matching network and meta-learner network, which run at real-time speeds. Conventional deep convolutional feature-based discriminative visual tracking…

Computer Vision and Pattern Recognition · Computer Science 2019-08-19 Janghoon Choi, Junseok Kwon, Kyoung Mu Lee

Tracking Holistic Object Representations

Recent advances in visual tracking are based on siamese feature extractors and template matching. For this category of trackers, latest research focuses on better feature embeddings and similarity measures. In this work, we focus on…

Computer Vision and Pattern Recognition · Computer Science 2019-08-07 Axel Sauer, Elie Aljalbout, Sami Haddadin

Context-aware Visual Tracking with Joint Meta-updating

Visual object tracking acts as a pivotal component in various emerging video applications. Despite the numerous developments in visual tracking, existing deep trackers are still likely to fail when tracking against objects with dramatic…

Computer Vision and Pattern Recognition · Computer Science 2022-04-05 Qiuhong Shen, Xin Li, Fanyang Meng, Yongsheng Liang

3D Siamese Transformer Network for Single Object Tracking on Point Clouds

Siamese network based trackers formulate 3D single object tracking as cross-correlation learning between point features of a template and a search area. Due to the large appearance variation between the template and search area during…

Computer Vision and Pattern Recognition · Computer Science 2022-07-27 Le Hui, Lingpeng Wang, Linghua Tang, Kaihao Lan, Jin Xie, Jian Yang

Tracking Transforming Objects: A Benchmark

Tracking transforming objects holds significant importance in various fields due to the dynamic nature of many real-world scenarios. By enabling systems to accurately represent transforming objects over time, tracking transforming objects…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 You Wu, Yuelong Wang, Yaxin Liao, Fuliang Wu, Hengzhou Ye, Shuiwang Li

TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking

Tracking multiple objects in videos relies on modeling the spatial-temporal interactions of the objects. In this paper, we propose a solution named TransMOT, which leverages powerful graph transformers to efficiently model the spatial and…

Computer Vision and Pattern Recognition · Computer Science 2021-04-06 Peng Chu, Jiang Wang, Quanzeng You, Haibin Ling, Zicheng Liu

Improving the Transformer Translation Model with Document-Level Context

Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance in a variety of translation tasks, how to use document-level context to deal with discourse phenomena problematic for Transformer…

Computation and Language · Computer Science 2018-10-09 Jiacheng Zhang, Huanbo Luan, Maosong Sun, FeiFei Zhai, Jingfang Xu, Min Zhang, Yang Liu

Video Text Tracking With a Spatio-Temporal Complementary Model

Text tracking is to track multiple texts in a video, and construct a trajectory for each text. Existing methods tackle this task by utilizing the tracking-by-detection framework, i.e., detecting the text instances in each frame…

Computer Vision and Pattern Recognition · Computer Science 2021-12-30 Yuzhe Gao, Xing Li, Jiajian Zhang, Yu Zhou, Dian Jin, Jing Wang, Shenggao Zhu, Xiang Bai

Correlation-Embedded Transformer Tracking: A Single-Branch Framework

Developing robust and discriminative appearance models has been a long-standing research challenge in visual object tracking. In the prevalent Siamese-based paradigm, the features extracted by the Siamese-like networks are often…

Computer Vision and Pattern Recognition · Computer Science 2024-09-04 Fei Xie, Wankou Yang, Chunyu Wang, Lei Chu, Yue Cao, Chao Ma, Wenjun Zeng