Related papers: General Compression Framework for Efficient Transf…

An Efficient Token Compression Framework for Visual Object Tracking

Refining visual representations by eliminating their internal feature-level redundancy is crucial for simultaneously optimizing the performance and computational cost of models in visual tracking. To enhance their performance, many…

Computer Vision and Pattern Recognition · Computer Science 2026-05-12 Weijing Wu, Qihua Liang, Bineng Zhong, Haiying Xia, Zhiyi Mo, Shuxiang Song

Exploring Dynamic Transformer for Efficient Object Tracking

The speed-precision trade-off is a critical problem for visual object tracking which usually requires low latency and deployment on constrained resources. Existing solutions for efficient tracking mainly focus on adopting light-weight…

Computer Vision and Pattern Recognition · Computer Science 2025-04-04 Jiawen Zhu, Xin Chen, Haiwen Diao, Shuai Li, Jun-Yan He, Chenyang Li, Bin Luo, Dong Wang, Huchuan Lu

ZoomTrack: Target-aware Non-uniform Resizing for Efficient Visual Tracking

Recently, the transformer has enabled the speed-oriented trackers to approach state-of-the-art (SOTA) performance with high-speed thanks to the smaller input size or the lighter feature extraction backbone, though they still substantially…

Computer Vision and Pattern Recognition · Computer Science 2023-10-17 Yutong Kou, Jin Gao, Bing Li, Gang Wang, Weiming Hu, Yizheng Wang, Liang Li

Balancing Specialization, Generalization, and Compression for Detection and Tracking

We propose a method for specializing deep detectors and trackers to restricted settings. Our approach is designed with the following goals in mind: (a) Improving accuracy in restricted domains; (b) preventing overfitting to new domains and…

Computer Vision and Pattern Recognition · Computer Science 2019-09-26 Dotan Kaufman, Koby Bibas, Eran Borenstein, Michael Chertok, Tal Hassner

Transforming Model Prediction for Tracking

Optimization based tracking methods have been widely successful by integrating a target model prediction module, providing effective global reasoning by minimizing an objective function. While this inductive bias integrates valuable domain…

Computer Vision and Pattern Recognition · Computer Science 2022-03-22 Christoph Mayer, Martin Danelljan, Goutam Bhat, Matthieu Paul, Danda Pani Paudel, Fisher Yu, Luc Van Gool

A Fast Transformer-based General-Purpose Lossless Compressor

Deep-learning-based compressor has received interests recently due to much improved compression ratio. However, modern approaches suffer from long execution time. To ease this problem, this paper targets on cutting down the execution time…

Machine Learning · Computer Science 2022-04-04 Yu Mao, Yufei Cui, Tei-Wei Kuo, Chun Jason Xue

Efficient Training for Visual Tracking with Deformable Transformer

Recent Transformer-based visual tracking models have showcased superior performance. Nevertheless, prior works have been resource-intensive, requiring prolonged GPU training hours and incurring high GFLOPs during inference due to…

Computer Vision and Pattern Recognition · Computer Science 2023-09-07 Qingmao Wei, Guotian Zeng, Bi Zeng

SwinTrack: A Simple and Strong Baseline for Transformer Tracking

Recently Transformer has been largely explored in tracking and shown state-of-the-art (SOTA) performance. However, existing efforts mainly focus on fusing and enhancing features generated by convolutional neural networks (CNNs). The…

Computer Vision and Pattern Recognition · Computer Science 2023-03-24 Liting Lin, Heng Fan, Zhipeng Zhang, Yong Xu, Haibin Ling

Global Tracking Transformers

We present a novel transformer-based architecture for global multi-object tracking. Our network takes a short sequence of frames as input and produces global trajectories for all objects. The core component is a global tracking transformer…

Computer Vision and Pattern Recognition · Computer Science 2022-04-27 Xingyi Zhou, Tianwei Yin, Vladlen Koltun, Philipp Krähenbühl

DecoderTracker: Decoder-Only Method for Multiple-Object Tracking

Decoder-only methods, such as GPT, have demonstrated superior performance in many areas compared to traditional encoder-decoder structure transformer methods. Over the years, end-to-end methods based on the traditional transformer…

Computer Vision and Pattern Recognition · Computer Science 2025-07-10 Liao Pan, Yang Feng, Zhao Wenhui, Yua Jinwen, Zhang Dingwen

SUTrack: Towards Simple and Unified Single Object Tracking

In this paper, we propose a simple yet unified single object tracking (SOT) framework, dubbed SUTrack. It consolidates five SOT tasks (RGB-based, RGB-Depth, RGB-Thermal, RGB-Event, RGB-Language Tracking) into a unified model trained in a…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Xin Chen, Ben Kang, Wanting Geng, Jiawen Zhu, Yi Liu, Dong Wang, Huchuan Lu

Efficient Joint Detection and Multiple Object Tracking with Spatially Aware Transformer

We propose a light-weight and highly efficient Joint Detection and Tracking pipeline for the task of Multi-Object Tracking using a fully-transformer architecture. It is a modified version of TransTrack, which overcomes the computational…

Computer Vision and Pattern Recognition · Computer Science 2022-11-11 Siddharth Sagar Nijhawan, Leo Hoshikawa, Atsushi Irie, Masakazu Yoshimura, Junji Otsuka, Takeshi Ohashi

Efficient Visual Tracking with Exemplar Transformers

The design of more complex and powerful neural network models has significantly advanced the state-of-the-art in visual object tracking. These advances can be attributed to deeper networks, or the introduction of new building blocks, such…

Computer Vision and Pattern Recognition · Computer Science 2022-10-05 Philippe Blatter, Menelaos Kanakis, Martin Danelljan, Luc Van Gool

Learning Spatio-Temporal Transformer for Visual Tracking

In this paper, we present a new tracking architecture with an encoder-decoder transformer as the key component. The encoder models the global spatio-temporal feature dependencies between target objects and search regions, while the decoder…

Computer Vision and Pattern Recognition · Computer Science 2021-04-01 Bin Yan, Houwen Peng, Jianlong Fu, Dong Wang, Huchuan Lu

Embedding Compression for Teacher-to-Student Knowledge Transfer

Common knowledge distillation methods require the teacher model and the student model to be trained on the same task. However, the usage of embeddings as teachers has also been proposed for different source tasks and target tasks. Prior…

Machine Learning · Computer Science 2024-02-13 Yiwei Ding, Alexander Lerch

UETrack: A Unified and Efficient Framework for Single Object Tracking

With growing real-world demands, efficient tracking has received increasing attention. However, most existing methods are limited to RGB inputs and struggle in multi-modal scenarios. Moreover, current multi-modal tracking approaches…

Computer Vision and Pattern Recognition · Computer Science 2026-03-04 Ben Kang, Jie Zhao, Xin Chen, Wanting Geng, Bin Zhang, Lu Zhang, Dong Wang, Huchuan Lu

TAPTR: Tracking Any Point with Transformers as Detection

In this paper, we propose a simple and strong framework for Tracking Any Point with TRansformers (TAPTR). Based on the observation that point tracking bears a great resemblance to object detection and tracking, we borrow designs from…

Computer Vision and Pattern Recognition · Computer Science 2024-03-21 Hongyang Li, Hao Zhang, Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Lei Zhang

FastTrackTr: Towards Fast Multi-Object Tracking with Transformers

Transformer-based multi-object tracking (MOT) methods have captured the attention of many researchers in recent years. However, these models often suffer from slow inference speeds due to their structure or other issues. To address this…

Computer Vision and Pattern Recognition · Computer Science 2025-07-31 Pan Liao, Feng Yang, Di Wu, Jinwen Yu, Wenhui Zhao, Dingwen Zhang

FocusTrack: One-Stage Focus-and-Suppress Framework for 3D Point Cloud Object Tracking

In 3D point cloud object tracking, the motion-centric methods have emerged as a promising avenue due to its superior performance in modeling inter-frame motion. However, existing two-stage motion-based approaches suffer from fundamental…

Computer Vision and Pattern Recognition · Computer Science 2026-03-17 Sifan Zhou, Jiahao Nie, Ziyu Zhao, Yichao Cao, Xiaobo Lu

Progressive Scaling Visual Object Tracking

In this work, we propose a progressive scaling training strategy for visual object tracking, systematically analyzing the influence of training data volume, model size, and input resolution on tracking performance. Our empirical study…

Computer Vision and Pattern Recognition · Computer Science 2025-05-29 Jack Hong, Shilin Yan, Zehao Xiao, Jiayin Cai, Xiaolong Jiang, Yao Hu, Henghui Ding