Related papers: Contrastive Learning for Multi-Object Tracking wit…

Object Detection with Transformers: A Review

The astounding performance of transformers in natural language processing (NLP) has motivated researchers to explore their applications in computer vision tasks. DEtection TRansformer (DETR) introduces transformers to object detection tasks…

Computer Vision and Pattern Recognition · Computer Science 2023-07-13 Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Muhammad Zeshan Afzal

Oriented Object Detection with Transformer

Object detection with Transformers (DETR) has achieved a competitive performance over traditional detectors, such as Faster R-CNN. However, the potential of DETR remains largely unexplored for the more challenging task of arbitrary-oriented…

Computer Vision and Pattern Recognition · Computer Science 2021-06-08 Teli Ma, Mingyuan Mao, Honghui Zheng, Peng Gao, Xiaodi Wang, Shumin Han, Errui Ding, Baochang Zhang, David Doermann

End-to-End Object Detection with Transformers

We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression…

Computer Vision and Pattern Recognition · Computer Science 2020-05-29 Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko

Pair DETR: Contrastive Learning Speeds Up DETR Training

The DETR object detection approach applies the transformer encoder and decoder architecture to detect objects and achieves promising performance. In this paper, we present a simple approach to address the main problem of DETR, the slow…

Computer Vision and Pattern Recognition · Computer Science 2022-11-14 Seyed Mehdi Iranmanesh, Xiaotong Chen, Kuo-Chin Lien

Rethinking Transformer-based Set Prediction for Object Detection

DETR is a recently proposed Transformer-based method which views object detection as a set prediction problem and achieves state-of-the-art performance but demands extra-long training time to converge. In this paper, we investigate the…

Computer Vision and Pattern Recognition · Computer Science 2021-10-13 Zhiqing Sun, Shengcao Cao, Yiming Yang, Kris Kitani

Deformable DETR: Deformable Transformers for End-to-End Object Detection

DETR has been recently proposed to eliminate the need for many hand-designed components in object detection while demonstrating good performance. However, it suffers from slow convergence and limited feature spatial resolution, due to the…

Computer Vision and Pattern Recognition · Computer Science 2021-03-19 Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai

FastTrackTr: Towards Fast Multi-Object Tracking with Transformers

Transformer-based multi-object tracking (MOT) methods have captured the attention of many researchers in recent years. However, these models often suffer from slow inference speeds due to their structure or other issues. To address this…

Computer Vision and Pattern Recognition · Computer Science 2025-07-31 Pan Liao, Feng Yang, Di Wu, Jinwen Yu, Wenhui Zhao, Dingwen Zhang

Motion-Aware Transformer for Multi-Object Tracking

Multi-object tracking (MOT) in videos remains challenging due to complex object motions and crowded scenes. Recent DETR-based frameworks offer end-to-end solutions but typically process detection and tracking queries jointly within a single…

Computer Vision and Pattern Recognition · Computer Science 2026-03-10 Xu Yang, Gady Agam

Investigating the Robustness and Properties of Detection Transformers (DETR) Toward Difficult Images

Transformer-based object detectors (DETR) have shown significant performance across machine vision tasks, ultimately in object detection. This detector is based on a self-attention mechanism along with the transformer encoder-decoder…

Computer Vision and Pattern Recognition · Computer Science 2023-10-16 Zhao Ning Zou, Yuhang Zhang, Robert Wijaya

MOTR: End-to-End Multiple-Object Tracking with Transformer

Temporal modeling of objects is a key challenge in multiple object tracking (MOT). Existing methods track by associating detections through motion-based and appearance-based similarity heuristics. The post-processing nature of association…

Computer Vision and Pattern Recognition · Computer Science 2022-07-20 Fangao Zeng, Bin Dong, Yuang Zhang, Tiancai Wang, Xiangyu Zhang, Yichen Wei

Miti-DETR: Object Detection based on Transformers with Mitigatory Self-Attention Convergence

Object Detection with Transformers (DETR) and related works reach or even surpass the highly-optimized Faster-RCNN baseline with self-attention network architectures. Inspired by the evidence that pure self-attention possesses a strong…

Computer Vision and Pattern Recognition · Computer Science 2021-12-28 Wenchi Ma, Tianxiao Zhang, Guanghui Wang

Object Detection for Vehicle Dashcams using Transformers

The use of intelligent automation is growing significantly in the automotive industry, as it assists drivers and fleet management companies, thus increasing their productivity. Dash cams are now being used for this purpose which enables the…

Computer Vision and Pattern Recognition · Computer Science 2024-08-29 Osama Mustafa, Khizer Ali, Anam Bibi, Imran Siddiqi, Momina Moetesum

Deformable Attention Mechanisms Applied to Object Detection, case of Remote Sensing

Object detection has recently seen an interesting trend in terms of the most innovative research work, this task being of particular importance in the field of remote sensing, given the consistency of these images in terms of geographical…

Computer Vision and Pattern Recognition · Computer Science 2025-06-02 Anasse Boutayeb, Iyad Lahsen-cherif, Ahmed El Khadimi

MODETR: Moving Object Detection with Transformers

Moving Object Detection (MOD) is a crucial task for the Autonomous Driving pipeline. MOD is usually handled via 2-stream convolutional architectures that incorporate both appearance and motion cues, without considering the inter-relations…

Computer Vision and Pattern Recognition · Computer Science 2021-06-23 Eslam Mohamed, Ahmad El-Sallab

Open World DETR: Transformer based Open World Object Detection

Open world object detection aims at detecting objects that are absent in the object classes of the training data as unknown objects without explicit supervision. Furthermore, the exact classes of the unknown objects must be identified…

Computer Vision and Pattern Recognition · Computer Science 2022-12-07 Na Dong, Yongqiang Zhang, Mingli Ding, Gim Hee Lee

DPDETR: Decoupled Position Detection Transformer for Infrared-Visible Object Detection

Infrared-visible object detection aims to achieve robust object detection by leveraging the complementary information of infrared and visible image pairs. However, the commonly existing modality misalignment problem presents two challenges:…

Computer Vision and Pattern Recognition · Computer Science 2025-10-02 Junjie Guo, Chenqiang Gao, Fangcen Liu, Deyu Meng

RotaTR: Detection Transformer for Dense and Rotated Object

Detecting the objects in dense and rotated scenes is a challenging task. Recent works on this topic are mostly based on Faster RCNN or Retinanet. As they are highly dependent on the pre-set dense anchors and the NMS operation, the approach…

Computer Vision and Pattern Recognition · Computer Science 2023-12-06 Zhu Yuke, Ruan Yumeng, Yang Lei, Guo Sheng

Can Deep Learning be Applied to Model-Based Multi-Object Tracking?

Multi-object tracking (MOT) is the problem of tracking the state of an unknown and time-varying number of objects using noisy measurements, with important applications such as autonomous driving, tracking animal behavior, defense systems,…

Machine Learning · Computer Science 2022-02-17 Juliano Pinto, Georg Hess, William Ljungbergh, Yuxuan Xia, Henk Wymeersch, Lennart Svensson

End-to-End Object Detection with Adaptive Clustering Transformer

End-to-end Object Detection with Transformer (DETR) proposes to perform object detection with Transformer and achieves comparable performance with two-stage object detection like Faster-RCNN. However, DETR needs huge computational resources…

Computer Vision and Pattern Recognition · Computer Science 2021-10-19 Minghang Zheng, Peng Gao, Renrui Zhang, Kunchang Li, Xiaogang Wang, Hongsheng Li, Hao Dong

Language-aware Multiple Datasets Detection Pretraining for DETRs

Pretraining on large-scale datasets can boost the performance of object detectors while the annotated datasets for object detection are hard to scale up due to the high labor cost. What we possess are numerous isolated field-specific…

Computer Vision and Pattern Recognition · Computer Science 2023-04-10 Jing Hao, Song Chen, Xiaodi Wang, Shumin Han