Related papers: Towards Data-Efficient Detection Transformers

DETR++: Taming Your Multi-Scale Detection Transformer

Convolutional Neural Networks (CNN) have dominated the field of detection ever since the success of AlexNet in ImageNet classification [12]. With the sweeping reform of Transformers [27] in natural language processing, Carion et al. [2]…

Computer Vision and Pattern Recognition · Computer Science 2022-06-08 Chi Zhang, Lijuan Liu, Xiaoxue Zang, Frederick Liu, Hao Zhang, Xinying Song, Jindong Chen

Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images

This paper takes an important step in bridging the performance gap between DETR and R-CNN for graphical object detection. Existing graphical object detection approaches have enjoyed recent enhancements in CNN-based object detection methods,…

Computer Vision and Pattern Recognition · Computer Science 2023-06-26 Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Marcus Liwicki, Muhammad Zeshan Afzal

Investigating the Robustness and Properties of Detection Transformers (DETR) Toward Difficult Images

Transformer-based object detectors (DETR) have shown significant performance across machine vision tasks, ultimately in object detection. This detector is based on a self-attention mechanism along with the transformer encoder-decoder…

Computer Vision and Pattern Recognition · Computer Science 2023-10-16 Zhao Ning Zou, Yuhang Zhang, Robert Wijaya

Less is More: Focus Attention for Efficient DETR

DETR-like models have significantly boosted the performance of detectors and even outperformed classical convolutional models. However, all tokens are treated equally without discrimination brings a redundant computational burden in the…

Computer Vision and Pattern Recognition · Computer Science 2023-07-25 Dehua Zheng, Wenhui Dong, Hailin Hu, Xinghao Chen, Yunhe Wang

Deformable DETR: Deformable Transformers for End-to-End Object Detection

DETR has been recently proposed to eliminate the need for many hand-designed components in object detection while demonstrating good performance. However, it suffers from slow convergence and limited feature spatial resolution, due to the…

Computer Vision and Pattern Recognition · Computer Science 2021-03-19 Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai

Q-DETR: An Efficient Low-Bit Quantized Detection Transformer

The recent detection transformer (DETR) has advanced object detection, but its application on resource-constrained devices requires massive computation and memory resources. Quantization stands out as a solution by representing the network…

Computer Vision and Pattern Recognition · Computer Science 2023-04-04 Sheng Xu, Yanjing Li, Mingbao Lin, Peng Gao, Guodong Guo, Jinhu Lu, Baochang Zhang

Learning A Sparse Transformer Network for Effective Image Deraining

Transformers-based methods have achieved significant performance in image deraining as they can model the non-local information which is vital for high-quality image reconstruction. In this paper, we find that most existing Transformers…

Computer Vision and Pattern Recognition · Computer Science 2023-03-22 Xiang Chen, Hao Li, Mingqiang Li, Jinshan Pan

Object Detection with Transformers: A Review

The astounding performance of transformers in natural language processing (NLP) has motivated researchers to explore their applications in computer vision tasks. DEtection TRansformer (DETR) introduces transformers to object detection tasks…

Computer Vision and Pattern Recognition · Computer Science 2023-07-13 Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Muhammad Zeshan Afzal

PaQ-DETR: Learning Pattern and Quality-Aware Dynamic Queries for Object Detection

Detection Transformer (DETR) has redefined object detection by casting it as a set prediction task within an end-to-end framework. Despite its elegance, DETR and its variants still rely on fixed learnable queries and suffer from severe…

Computer Vision and Pattern Recognition · Computer Science 2026-03-24 Zhengjian Kang, Jun Zhuang, Kangtong Mo, Qi Chen, Rui Liu, Ye Zhang

Knowledge Distillation via Query Selection for Detection Transformer

Transformers have revolutionized the object detection landscape by introducing DETRs, acclaimed for their simplicity and efficacy. Despite their advantages, the substantial size of these models poses significant challenges for practical…

Computer Vision and Pattern Recognition · Computer Science 2024-09-11 Yi Liu, Luting Wang, Zongheng Tang, Yue Liao, Yifan Sun, Lijun Zhang, Si Liu

Visual Transformer for Object Detection

Convolutional Neural networks (CNN) have been the first choice of paradigm in many computer vision applications. The convolution operation however has a significant weakness which is it only operates on a local neighborhood of pixels, thus…

Computer Vision and Pattern Recognition · Computer Science 2022-06-14 Michael Yang

End-to-End Object Detection with Transformers

We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression…

Computer Vision and Pattern Recognition · Computer Science 2020-05-29 Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko

Le-DETR: Revisiting Real-Time Detection Transformer with Efficient Encoder Design

Real-time object detection is crucial for real-world applications as it requires high accuracy with low latency. While Detection Transformers (DETR) have demonstrated significant performance improvements, current real-time DETR models are…

Computer Vision and Pattern Recognition · Computer Science 2026-02-25 Jiannan Huang, Aditya Kane, Fengzhe Zhou, Yunchao Wei, Humphrey Shi

PnP-DETR: Towards Efficient Visual Analysis with Transformers

Recently, DETR pioneered the solution of vision tasks with transformers, it directly translates the image feature map into the object detection result. Though effective, translating the full feature map can be costly due to redundant…

Computer Vision and Pattern Recognition · Computer Science 2022-03-03 Tao Wang, Li Yuan, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

Towards Few-Annotation Learning for Object Detection: Are Transformer-based Models More Efficient ?

For specialized and dense downstream tasks such as object detection, labeling data requires expertise and can be very expensive, making few-shot and semi-supervised models much more attractive alternatives. While in the few-shot setup we…

Computer Vision and Pattern Recognition · Computer Science 2023-11-01 Quentin Bouniot, Angélique Loesch, Romaric Audigier, Amaury Habrard

LoFTR: Detector-Free Local Feature Matching with Transformers

We present a novel method for local image feature matching. Instead of performing image feature detection, description, and matching sequentially, we propose to first establish pixel-wise dense matches at a coarse level and later refine the…

Computer Vision and Pattern Recognition · Computer Science 2021-04-02 Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, Xiaowei Zhou

SO-DETR: Leveraging Dual-Domain Features and Knowledge Distillation for Small Object Detection

Detection Transformer-based methods have achieved significant advancements in general object detection. However, challenges remain in effectively detecting small objects. One key difficulty is that existing encoders struggle to efficiently…

Computer Vision and Pattern Recognition · Computer Science 2025-04-17 Huaxiang Zhang, Hao Zhang, Aoran Mei, Zhongxue Gan, Guo-Niu Zhu

DRCT: Saving Image Super-resolution away from Information Bottleneck

In recent years, Vision Transformer-based approaches for low-level vision tasks have achieved widespread success. Unlike CNN-based models, Transformers are more adept at capturing long-range dependencies, enabling the reconstruction of…

Computer Vision and Pattern Recognition · Computer Science 2024-11-26 Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou

Small Object Detection by DETR via Information Augmentation and Adaptive Feature Fusion

The main challenge for small object detection algorithms is to ensure accuracy while pursuing real-time performance. The RT-DETR model performs well in real-time object detection, but performs poorly in small object detection accuracy. In…

Computer Vision and Pattern Recognition · Computer Science 2024-01-17 Ji Huang, Hui Wang

Lite DETR : An Interleaved Multi-Scale Encoder for Efficient DETR

Recent DEtection TRansformer-based (DETR) models have obtained remarkable performance. Its success cannot be achieved without the re-introduction of multi-scale feature fusion in the encoder. However, the excessively increased tokens in…

Computer Vision and Pattern Recognition · Computer Science 2023-03-14 Feng Li, Ailing Zeng, Shilong Liu, Hao Zhang, Hongyang Li, Lei Zhang, Lionel M. Ni