Related papers: Transformer Meets Tracker: Exploiting Temporal Con…

200 papers

In this paper, we present a new tracking architecture with an encoder-decoder transformer as the key component. The encoder models the global spatio-temporal feature dependencies between target objects and search regions, while the decoder…

Computer Vision and Pattern Recognition · Computer Science 2021-04-01 Bin Yan , Houwen Peng , Jianlong Fu , Dong Wang , Huchuan Lu

Generic object tracking remains an important yet challenging task in computer vision due to complex spatio-temporal dynamics, especially in the presence of occlusions, similar distractors, and appearance variations. Over the past two…

Computer Vision and Pattern Recognition · Computer Science 2025-08-01 Fereshteh Aghaee Meibodi , Shadi Alijani , Homayoun Najjaran

Template-based discriminative trackers are currently the dominant tracking methods due to their robustness and accuracy, and the Siamese-network-based methods that depend on cross-correlation operation between features extracted from…

Computer Vision and Pattern Recognition · Computer Science 2021-05-11 Moju Zhao , Kei Okada , Masayuki Inaba

High computational power and significant time are usually needed to train a deep learning based tracker on large datasets. Depending on many factors, training might not always be an option. In this paper, we propose a framework with two…

Computer Vision and Pattern Recognition · Computer Science 2024-10-16 Ali Sekhavati , Won-Sook Lee

The challenging task of multi-object tracking (MOT) requires simultaneous reasoning about track initialization, identity, and spatio-temporal trajectories. We formulate this task as a frame-to-frame set prediction problem and introduce…

Computer Vision and Pattern Recognition · Computer Science 2022-05-02 Tim Meinhardt , Alexander Kirillov , Laura Leal-Taixe , Christoph Feichtenhofer

Recent advances in Siamese network-based visual tracking methods have enabled high performance on numerous tracking benchmarks. However, extensive scale variations of the target object and distractor objects with similar categories have…

Computer Vision and Pattern Recognition · Computer Science 2020-07-15 Janghoon Choi , Junseok Kwon , Kyoung Mu Lee

The unique complementarity of frame-based and event cameras for high frame rate object tracking has recently inspired some research attempts to develop multi-modal fusion approaches. However, these methods directly fuse both modalities and…

Computer Vision and Pattern Recognition · Computer Science 2024-11-05 Yucheng Chen , Lin Wang

Existing Visual Object Tracking (VOT) only takes the target area in the first frame as a template. This causes tracking to inevitably fail in fast-changing and crowded scenes, as it cannot account for changes in object appearance between…

Computer Vision and Pattern Recognition · Computer Science 2023-03-31 Jin-Peng Lan , Zhi-Qi Cheng , Jun-Yan He , Chenyang Li , Bin Luo , Xu Bao , Wangmeng Xiang , Yifeng Geng , Xuansong Xie

In this work, we propose a novel paradigm to encode the position of targets for target tracking in videos using transformers. The proposed paradigm, Dense Spatio-Temporal (DST) position encoding, encodes spatio-temporal position information…

Computer Vision and Pattern Recognition · Computer Science 2022-10-19 Jinkun Cao , Hao Wu , Kris Kitani

Recently, Siamese network based trackers have received tremendous interest for their fast tracking speed and high performance. Despite the great success, this tracking framework still suffers from several limitations. First, it cannot…

Computer Vision and Pattern Recognition · Computer Science 2018-09-06 Anfeng He , Chong Luo , Xinmei Tian , Wenjun Zeng

Transformer-based visual object tracking has been utilized extensively. However, the Transformer structure lacks sufficient inductive bias. In addition, focusing only on encoding the global feature harms the modeling of local details,…

Computer Vision and Pattern Recognition · Computer Science 2022-08-09 Changhong Fu , Weiyu Peng , Sihang Li , Junjie Ye , Ziang Cao

In this paper, we propose a novel on-line visual tracking framework based on the Siamese matching network and meta-learner network, which run at real-time speeds. Conventional deep convolutional feature-based discriminative visual tracking…

Computer Vision and Pattern Recognition · Computer Science 2019-08-19 Janghoon Choi , Junseok Kwon , Kyoung Mu Lee

Recent advances in visual tracking are based on siamese feature extractors and template matching. For this category of trackers, latest research focuses on better feature embeddings and similarity measures. In this work, we focus on…

Computer Vision and Pattern Recognition · Computer Science 2019-08-07 Axel Sauer , Elie Aljalbout , Sami Haddadin

Visual object tracking acts as a pivotal component in various emerging video applications. Despite the numerous developments in visual tracking, existing deep trackers are still likely to fail when tracking against objects with dramatic…

Computer Vision and Pattern Recognition · Computer Science 2022-04-05 Qiuhong Shen , Xin Li , Fanyang Meng , Yongsheng Liang

Siamese network based trackers formulate 3D single object tracking as cross-correlation learning between point features of a template and a search area. Due to the large appearance variation between the template and search area during…

Computer Vision and Pattern Recognition · Computer Science 2022-07-27 Le Hui , Lingpeng Wang , Linghua Tang , Kaihao Lan , Jin Xie , Jian Yang

Tracking transforming objects holds significant importance in various fields due to the dynamic nature of many real-world scenarios. By enabling systems to accurately represent transforming objects over time, tracking transforming objects…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 You Wu , Yuelong Wang , Yaxin Liao , Fuliang Wu , Hengzhou Ye , Shuiwang Li

Tracking multiple objects in videos relies on modeling the spatial-temporal interactions of the objects. In this paper, we propose a solution named TransMOT, which leverages powerful graph transformers to efficiently model the spatial and…

Computer Vision and Pattern Recognition · Computer Science 2021-04-06 Peng Chu , Jiang Wang , Quanzeng You , Haibin Ling , Zicheng Liu

Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance in a variety of translation tasks, how to use document-level context to deal with discourse phenomena problematic for Transformer…

Computation and Language · Computer Science 2018-10-09 Jiacheng Zhang , Huanbo Luan , Maosong Sun , FeiFei Zhai , Jingfang Xu , Min Zhang , Yang Liu

Text tracking is to track multiple texts in a video, and construct a trajectory for each text. Existing methods tackle this task by utilizing the tracking-by-detection framework, i.e., detecting the text instances in each frame…

Computer Vision and Pattern Recognition · Computer Science 2021-12-30 Yuzhe Gao , Xing Li , Jiajian Zhang , Yu Zhou , Dian Jin , Jing Wang , Shenggao Zhu , Xiang Bai

Developing robust and discriminative appearance models has been a long-standing research challenge in visual object tracking. In the prevalent Siamese-based paradigm, the features extracted by the Siamese-like networks are often…

Computer Vision and Pattern Recognition · Computer Science 2024-09-04 Fei Xie , Wankou Yang , Chunyu Wang , Lei Chu , Yue Cao , Chao Ma , Wenjun Zeng