Related papers: Multi-class Token Transformer for Weakly Supervise…

MCTformer+: Multi-Class Token Transformer for Weakly Supervised Semantic Segmentation

This paper proposes a novel transformer-based framework that aims to enhance weakly supervised semantic segmentation (WSSS) by generating accurate class-specific object localization maps as pseudo labels. Building upon the observation that…

Computer Vision and Pattern Recognition · Computer Science 2023-08-08 Lian Xu, Mohammed Bennamoun, Farid Boussaid, Hamid Laga, Wanli Ouyang, Dan Xu

Know Your Attention Maps: Class-specific Token Masking for Weakly Supervised Semantic Segmentation

Weakly Supervised Semantic Segmentation (WSSS) is a challenging problem that has been extensively studied in recent years. Traditional approaches often rely on external modules like Class Activation Maps to highlight regions of interest and…

Computer Vision and Pattern Recognition · Computer Science 2025-07-10 Joelle Hanna, Damian Borth

Re-Attention Transformer for Weakly Supervised Object Localization

Weakly supervised object localization is a challenging task that aims to localize objects with coarse annotations such as image categories. Existing deep network approaches are mainly based on class activation map, which focuses on…

Computer Vision and Pattern Recognition · Computer Science 2023-02-28 Hui Su, Yue Ye, Zhiwei Chen, Mingli Song, Lechao Cheng

Multiscale Vision Transformer With Deep Clustering-Guided Refinement for Weakly Supervised Object Localization

This work addresses the task of weakly-supervised object localization. The goal is to learn object localization using only image-level class labels, which are much easier to obtain compared to bounding box annotations. This task is…

Computer Vision and Pattern Recognition · Computer Science 2023-12-18 David Kim, Sinhae Cha, Byeongkeun Kang

Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration

Weakly Supervised Object Localization (WSOL), which aims to localize objects by only using image-level labels, has attracted much attention because of its low annotation cost in real applications. Recent studies leverage the advantage of…

Computer Vision and Pattern Recognition · Computer Science 2023-03-13 Haotian Bai, Ruimao Zhang, Jiong Wang, Xiang Wan

Constrained Sampling for Class-Agnostic Weakly Supervised Object Localization

Self-supervised vision transformers can generate accurate localization maps of the objects in an image. However, since they decompose the scene into multiple maps containing various objects, and they do not rely on any explicit supervisory…

Computer Vision and Pattern Recognition · Computer Science 2022-09-20 Shakeeb Murtaza, Soufiane Belharbi, Marco Pedersoli, Aydin Sarraf, Eric Granger

Dual Progressive Transformations for Weakly Supervised Semantic Segmentation

Weakly supervised semantic segmentation (WSSS), which aims to mine the object regions by merely using class-level labels, is a challenging task in computer vision. The current state-of-the-art CNN-based methods usually adopt…

Computer Vision and Pattern Recognition · Computer Science 2022-10-03 Dongjian Huo, Yukun Su, Qingyao Wu

Leveraging Swin Transformer for Local-to-Global Weakly Supervised Semantic Segmentation

In recent years, weakly supervised semantic segmentation using image-level labels as supervision has received significant attention in the field of computer vision. Most existing methods have addressed the challenges arising from the lack…

Computer Vision and Pattern Recognition · Computer Science 2024-03-12 Rozhan Ahmadi, Shohreh Kasaei

WeakTr: Exploring Plain Vision Transformer for Weakly-supervised Semantic Segmentation

Transformer has been very successful in various computer vision tasks and understanding the working mechanism of transformer is important. As touchstones, weakly-supervised semantic segmentation (WSSS) and class activation map (CAM) are…

Computer Vision and Pattern Recognition · Computer Science 2026-03-26 Lianghui Zhu, Yingyue Li, Jiemin Fang, Yan Liu, Hao Xin, Wenyu Liu, Xinggang Wang

Token Contrast for Weakly-Supervised Semantic Segmentation

Weakly-Supervised Semantic Segmentation (WSSS) using image-level labels typically utilizes Class Activation Map (CAM) to generate the pseudo labels. Limited by the local structure perception of CNN, CAM usually cannot identify the integral…

Computer Vision and Pattern Recognition · Computer Science 2023-03-03 Lixiang Ru, Heliang Zheng, Yibing Zhan, Bo Du

Spatial-Aware Token for Weakly Supervised Object Localization

Weakly supervised object localization (WSOL) is a challenging task aiming to localize objects with only image-level supervision. Recent works apply visual transformer to WSOL and achieve significant success by exploiting the long-range…

Computer Vision and Pattern Recognition · Computer Science 2023-08-10 Pingyu Wu, Wei Zhai, Yang Cao, Jiebo Luo, Zheng-Jun Zha

Semantic-Constraint Matching Transformer for Weakly Supervised Object Localization

Weakly supervised object localization (WSOL) strives to learn to localize objects with only image-level supervision. Due to the local receptive fields generated by convolution operations, previous CNN-based methods suffer from partial…

Computer Vision and Pattern Recognition · Computer Science 2023-09-06 Yiwen Cao, Yukun Su, Wenjun Wang, Yanxia Liu, Qingyao Wu

WegFormer: Transformers for Weakly Supervised Semantic Segmentation

Although convolutional neural networks (CNNs) have achieved remarkable progress in weakly supervised semantic segmentation (WSSS), the effective receptive field of CNN is insufficient to capture global context information, leading to…

Computer Vision and Pattern Recognition · Computer Science 2022-03-17 Chunmeng Liu, Enze Xie, Wenjia Wang, Wenhai Wang, Guangyao Li, Ping Luo

TransCAM: Transformer Attention-based CAM Refinement for Weakly Supervised Semantic Segmentation

Weakly supervised semantic segmentation (WSSS) with only image-level supervision is a challenging task. Most existing methods exploit Class Activation Maps (CAM) to generate pixel-level pseudo labels for supervised training. However, due to…

Computer Vision and Pattern Recognition · Computer Science 2023-03-17 Ruiwen Li, Zheda Mai, Chiheb Trabelsi, Zhibo Zhang, Jongseong Jang, Scott Sanner

SemFormer: Semantic Guided Activation Transformer for Weakly Supervised Semantic Segmentation

Recent mainstream weakly supervised semantic segmentation (WSSS) approaches are mainly based on Class Activation Map (CAM) generated by a CNN (Convolutional Neural Network) based image classifier. In this paper, we propose a novel…

Computer Vision and Pattern Recognition · Computer Science 2022-10-27 Junliang Chen, Xiaodong Zhao, Cheng Luo, Linlin Shen

Dual-Augmented Transformer Network for Weakly Supervised Semantic Segmentation

Weakly supervised semantic segmentation (WSSS) is a fundamental computer vision task that aims to segment objects using only class-level labels. The traditional methods adopt the CNN-based network and utilize the class activation…

Computer Vision and Pattern Recognition · Computer Science 2023-10-03 Jingliang Deng, Zonghan Li

A Self-Training Framework Based on Multi-Scale Attention Fusion for Weakly Supervised Semantic Segmentation

Weakly supervised semantic segmentation (WSSS) based on image-level labels is challenging since it is hard to obtain complete semantic regions. To address this issue, we propose a self-training method that utilizes fused multi-scale…

Computer Vision and Pattern Recognition · Computer Science 2023-05-11 Guoqing Yang, Chuang Zhu, Yu Zhang

Transformer-based Multi-Modal Learning for Multi Label Remote Sensing Image Classification

In this paper, we introduce a novel Synchronized Class Token Fusion (SCT Fusion) architecture in the framework of multi-modal multi-label classification (MLC) of remote sensing (RS) images. The proposed architecture leverages…

Computer Vision and Pattern Recognition · Computer Science 2023-06-05 David Hoffmann, Kai Norman Clasen, Begüm Demir

Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer

Text spotting end-to-end methods have recently gained attention in the literature due to the benefits of jointly optimizing the text detection and recognition components. Existing methods usually have a distinct separation between the…

Computer Vision and Pattern Recognition · Computer Science 2022-02-15 Yair Kittenplon, Inbal Lavi, Sharon Fogel, Yarin Bar, R. Manmatha, Pietro Perona

CaFT: Clustering and Filter on Tokens of Transformer for Weakly Supervised Object Localization

Weakly supervised object localization (WSOL) is a challenging task that aims to localize objects using only category labels. However, there is a contradiction between classification and localization because an accurate classification network tends to pay…

Computer Vision and Pattern Recognition · Computer Science 2022-01-04 Ming Li