Related papers: Learning Spatio-Temporal Transformer for Visual Tr…

200 papers

In video object tracking, there exist rich temporal contexts among successive frames, which have been largely overlooked in existing trackers. In this work, we bridge the individual video frames and explore the temporal contexts across them…

Computer Vision and Pattern Recognition · Computer Science 2021-03-25 Ning Wang , Wengang Zhou , Jie Wang , Houqiang Li

Transformer-based visual object tracking has been utilized extensively. However, the Transformer structure lacks sufficient inductive bias. In addition, focusing only on encoding the global feature harms the modeling of local details,…

Computer Vision and Pattern Recognition · Computer Science 2022-08-09 Changhong Fu , Weiyu Peng , Sihang Li , Junjie Ye , Ziang Cao

In this work, we propose a novel paradigm to encode the position of targets for target tracking in videos using transformers. The proposed paradigm, Dense Spatio-Temporal (DST) position encoding, encodes spatio-temporal position information…

Computer Vision and Pattern Recognition · Computer Science 2022-10-19 Jinkun Cao , Hao Wu , Kris Kitani

We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression…

Computer Vision and Pattern Recognition · Computer Science 2020-05-29 Nicolas Carion , Francisco Massa , Gabriel Synnaeve , Nicolas Usunier , Alexander Kirillov , Sergey Zagoruyko

The recent trend in multiple object tracking (MOT) is heading towards leveraging deep learning to boost the tracking performance. In this paper, we propose a novel solution named TransSTAM, which leverages Transformer to effectively model…

Computer Vision and Pattern Recognition · Computer Science 2022-06-01 Peng Dai , Yiqiang Feng , Renliang Weng , Changshui Zhang

Current object detection approaches predict bounding boxes, but these provide little instance-specific information beyond location, scale and aspect ratio. In this work, we propose to directly regress to objects' shapes in addition to their…

Computer Vision and Pattern Recognition · Computer Science 2017-07-06 Saumya Jetley , Michael Sapienza , Stuart Golodetz , Philip H. S. Torr

In machine learning, effective modeling requires a holistic consideration of how to encode inputs, make predictions (i.e., decoding), and train the model. However, in time-series forecasting, prior work has predominantly focused on encoder…

Machine Learning · Computer Science 2025-12-30 Jaebin Lee , Hankook Lee

The strong demand for autonomous driving in the industry has led to strong interest in 3D object detection and resulted in many excellent 3D object detection algorithms. However, the vast majority of algorithms only model single-frame data,…

Computer Vision and Pattern Recognition · Computer Science 2020-11-30 Zhenxun Yuan , Xiao Song , Lei Bai , Wengang Zhou , Zhe Wang , Wanli Ouyang

Template-based discriminative trackers are currently the dominant tracking methods due to their robustness and accuracy, and the Siamese-network-based methods that depend on cross-correlation operation between features extracted from…

Computer Vision and Pattern Recognition · Computer Science 2021-05-11 Moju Zhao , Kei Okada , Masayuki Inaba

The challenging task of multi-object tracking (MOT) requires simultaneous reasoning about track initialization, identity, and spatio-temporal trajectories. We formulate this task as a frame-to-frame set prediction problem and introduce…

Computer Vision and Pattern Recognition · Computer Science 2022-05-02 Tim Meinhardt , Alexander Kirillov , Laura Leal-Taixe , Christoph Feichtenhofer

The success of visual tracking has been largely driven by datasets with manual box annotations. However, these box annotations require tremendous human effort, limiting the scale and diversity of existing tracking datasets. In this work, we…

Computer Vision and Pattern Recognition · Computer Science 2025-07-30 Yaozong Zheng , Bineng Zhong , Qihua Liang , Ning Li , Shuxiang Song

We propose the task Future Object Detection, in which the goal is to predict the bounding boxes for all visible objects in a future video frame. While this task involves recognizing temporal and kinematic patterns, in addition to the…

Computer Vision and Pattern Recognition · Computer Science 2022-10-18 Adam Tonderski , Joakim Johnander , Christoffer Petersson , Kalle Åström

Visual object tracking is the problem of predicting a target object's state in a video. Generally, bounding-boxes have been used to represent states, and a surge of effort has been spent by the community to produce efficient causal…

Computer Vision and Pattern Recognition · Computer Science 2021-02-02 Matteo Dunnhofer , Niki Martinel , Christian Micheloni

Recent Transformer-based visual tracking models have showcased superior performance. Nevertheless, prior works have been resource-intensive, requiring prolonged GPU training hours and incurring high GFLOPs during inference due to…

Computer Vision and Pattern Recognition · Computer Science 2023-09-07 Qingmao Wei , Guotian Zeng , Bi Zeng

We propose ST-DETR, a Spatio-Temporal Transformer-based architecture for object detection from a sequence of temporal frames. We treat the temporal frames as sequences in both space and time and employ the full attention mechanisms to take…

Computer Vision and Pattern Recognition · Computer Science 2021-07-27 Eslam Mohamed , Ahmad El-Sallab

In this paper, we propose a deep learning based vehicle trajectory prediction technique which can generate the future trajectory sequence of surrounding vehicles in real time. We employ the encoder-decoder architecture which analyzes the…

Machine Learning · Computer Science 2018-10-23 Seong Hyeon Park , ByeongDo Kim , Chang Mook Kang , Chung Choo Chung , Jun Won Choi

Multi-Object Tracking (MOT) is a critical problem in computer vision, essential for understanding how objects move and interact in videos. This field faces significant challenges such as occlusions and complex environmental dynamics,…

Computer Vision and Pattern Recognition · Computer Science 2025-02-10 Luiz C. S. de Araujo , Carlos M. S. Figueiredo

Fast appearance variations and the distractions of similar objects are two of the most challenging problems in visual object tracking. Unlike many existing trackers that focus on modeling only the target, in this work, we consider the…

Computer Vision and Pattern Recognition · Computer Science 2020-08-28 Bi Li , Chengquan Zhang , Zhibin Hong , Xu Tang , Jingtuo Liu , Junyu Han , Errui Ding , Wenyu Liu

Recently, DETR and Deformable DETR have been proposed to eliminate the need for many hand-designed components in object detection while demonstrating performance as good as previous complex hand-crafted detectors. However, their performance on…

Computer Vision and Pattern Recognition · Computer Science 2021-05-25 Lu He , Qianyu Zhou , Xiangtai Li , Li Niu , Guangliang Cheng , Xiao Li , Wenxuan Liu , Yunhai Tong , Lizhuang Ma , Liqing Zhang

Existing visual object tracking usually learns a bounding-box based template to match the targets across frames, which cannot accurately learn a pixel-wise representation, thereby being limited in handling severe appearance variations. To…

Computer Vision and Pattern Recognition · Computer Science 2021-04-07 Fei Xie , Wankou Yang , Bo Liu , Kaihua Zhang , Wanli Xue , Wangmeng Zuo