Related papers: Task-Specific Context Decoupling for Object Detect…

Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection

The introduction of DETR represents a new paradigm for object detection. However, its decoder conducts classification and box localization using shared queries and cross-attention layers, leading to suboptimal results. We observe that…

Computer Vision and Pattern Recognition · Computer Science 2023-10-25 Manyuan Zhang, Guanglu Song, Yu Liu, Hongsheng Li

Revisiting the Sibling Head in Object Detector

The "shared head for classification and localization" (sibling head), firstly denominated in Fast RCNN (Girshick, 2015), has been leading the fashion of the object detection community in the past five years. This paper provides the…

Computer Vision and Pattern Recognition · Computer Science 2020-03-18 Guanglu Song, Yu Liu, Xiaogang Wang

Rethinking Classification and Localization for Object Detection

Two head structures (i.e., fully connected head and convolution head) have been widely used in R-CNN based detectors for classification and localization tasks. However, there is a lack of understanding of how these two head structures…

Computer Vision and Pattern Recognition · Computer Science 2020-04-06 Yue Wu, Yinpeng Chen, Lu Yuan, Zicheng Liu, Lijuan Wang, Hongzhi Li, Yun Fu

Decoupling Localization and Classification in Single Shot Temporal Action Detection

Video temporal action detection aims to temporally localize and recognize the action in untrimmed videos. Existing one-stage approaches mostly focus on unifying two subtasks, i.e., localization of action proposals and classification of each…

Computer Vision and Pattern Recognition · Computer Science 2019-04-17 Yupan Huang, Qi Dai, Yutong Lu

Cross-Supervised Object Detection

After learning a new object category from image-level annotations (with no object bounding boxes), humans are remarkably good at precisely localizing those objects. However, building good object localizers (i.e., detectors) currently…

Computer Vision and Pattern Recognition · Computer Science 2020-06-30 Zitian Chen, Zhiqiang Shen, Jiahui Yu, Erik Learned-Miller

Decoupling Classifier for Boosting Few-shot Object Detection and Instance Segmentation

This paper focuses on few-shot object detection (FSOD) and instance segmentation (FSIS), which requires a model to quickly adapt to novel classes with a few labeled instances. The existing methods severely suffer from bias classification…

Computer Vision and Pattern Recognition · Computer Science 2025-05-21 Bin-Bin Gao, Xiaochen Chen, Zhongyi Huang, Congchong Nie, Jun Liu, Jinxiang Lai, Guannan Jiang, Xi Wang, Chengjie Wang

TOOD: Task-aligned One-stage Object Detection

One-stage object detection is commonly implemented by optimizing two sub-tasks: object classification and localization, using heads with two parallel branches, which might lead to a certain level of spatial misalignment in predictions…

Computer Vision and Pattern Recognition · Computer Science 2021-08-31 Chengjian Feng, Yujie Zhong, Yu Gao, Matthew R. Scott, Weilin Huang

CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection

Task driven object detection aims to detect object instances suitable for affording a task in an image. Its challenge lies in object categories available for the task being too diverse to be limited to a closed set of object vocabulary for…

Computer Vision and Pattern Recognition · Computer Science 2023-09-06 Jiajin Tang, Ge Zheng, Jingyi Yu, Sibei Yang

DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception

Dense visual prediction tasks have been constrained by their reliance on predefined categories, limiting their applicability in real-world scenarios where visual concepts are unbounded. While Vision-Language Models (VLMs) like CLIP have…

Computer Vision and Pattern Recognition · Computer Science 2025-05-08 Junjie Wang, Bin Chen, Yulin Li, Bin Kang, Yichi Chen, Zhuotao Tian

Instance Localization for Self-supervised Detection Pretraining

Prior research on self-supervised learning has led to considerable progress on image classification, but often with degraded transfer performance on object detection. The objective of this paper is to advance self-supervised pretrained…

Computer Vision and Pattern Recognition · Computer Science 2021-04-07 Ceyuan Yang, Zhirong Wu, Bolei Zhou, Stephen Lin

Learning Fixation Point Strategy for Object Detection and Classification

We propose a novel recurrent attentional structure to localize and recognize objects jointly. The network can learn to extract a sequence of local observations with detailed appearance and rough context, instead of sliding windows or…

Computer Vision and Pattern Recognition · Computer Science 2017-12-20 Jie Lyu, Zejian Yuan, Dapeng Chen

Hybrid Task Cascade for Instance Segmentation

Cascade is a classic yet powerful architecture that has boosted performance on various tasks. However, how to introduce cascade to instance segmentation remains an open question. A simple combination of Cascade R-CNN and Mask R-CNN only…

Computer Vision and Pattern Recognition · Computer Science 2019-04-10 Kai Chen, Jiangmiao Pang, Jiaqi Wang, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin

Decoupled Adaptation for Cross-Domain Object Detection

Cross-domain object detection is more challenging than object classification since multiple objects exist in an image and the location of each object is unknown in the unlabeled target domain. As a result, when we adapt features of…

Computer Vision and Pattern Recognition · Computer Science 2022-05-10 Junguang Jiang, Baixu Chen, Jianmin Wang, Mingsheng Long

Seeing the Unseen: Mask-Driven Positional Encoding and Strip-Convolution Context Modeling for Cross-View Object Geo-Localization

Cross-view object geo-localization enables high-precision object localization through cross-view matching, with critical applications in autonomous driving, urban management, and disaster response. However, existing methods rely on…

Computer Vision and Pattern Recognition · Computer Science 2025-10-24 Shuhan Hu, Yiru Li, Yuanyuan Li, Yingying Zhu

With Great Context Comes Great Prediction Power: Classifying Objects via Geo-Semantic Scene Graphs

Humans effortlessly identify objects by leveraging a rich understanding of the surrounding scene, including spatial relationships, material properties, and the co-occurrence of other objects. In contrast, most computational object…

Computer Vision and Pattern Recognition · Computer Science 2025-12-30 Ciprian Constantinescu, Marius Leordeanu

Localizing Task Recognition and Task Learning in In-Context Learning via Attention Head Analysis

We investigate the mechanistic underpinnings of in-context learning (ICL) in large language models by reconciling two dominant perspectives: the component-level analysis of attention heads and the holistic decomposition of ICL into Task…

Computation and Language · Computer Science 2026-05-04 Haolin Yang, Hakaze Cho, Naoya Inoue

Separate and Conquer: Decoupling Co-occurrence via Decomposition and Representation for Weakly Supervised Semantic Segmentation

Weakly supervised semantic segmentation (WSSS) with image-level labels aims to achieve segmentation tasks without dense annotations. However, attributed to the frequent coupling of co-occurring objects and the limited supervision from…

Computer Vision and Pattern Recognition · Computer Science 2024-03-22 Zhiwei Yang, Kexue Fu, Minghong Duan, Linhao Qu, Shuo Wang, Zhijian Song

Seamless Detection: Unifying Salient Object Detection and Camouflaged Object Detection

Achieving joint learning of Salient Object Detection (SOD) and Camouflaged Object Detection (COD) is extremely challenging due to their distinct object characteristics, i.e., saliency and camouflage. The only preliminary research treats…

Computer Vision and Pattern Recognition · Computer Science 2025-03-26 Yi Liu, Chengxin Li, Xiaohui Dong, Lei Li, Dingwen Zhang, Shoukun Xu, Jungong Han

Improving Object Detection and Attribute Recognition by Feature Entanglement Reduction

We explore object detection with two attributes: color and material. The task aims to simultaneously detect objects and infer their color and material. A straight-forward approach is to add attribute heads at the very end of a usual object…

Computer Vision and Pattern Recognition · Computer Science 2021-08-27 Zhaoheng Zheng, Arka Sadhu, Ram Nevatia

A Simple Framework for Open-Vocabulary Segmentation and Detection

We present OpenSeeD, a simple Open-vocabulary Segmentation and Detection framework that jointly learns from different segmentation and detection datasets. To bridge the gap of vocabulary and annotation granularity, we first introduce a…

Computer Vision and Pattern Recognition · Computer Science 2023-03-21 Hao Zhang, Feng Li, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianfeng Gao, Jianwei Yang, Lei Zhang