Related papers: CARD: Semantic Segmentation with Efficient Class-A…

CAR: Class-aware Regularizations for Semantic Segmentation

Recent segmentation methods, such as OCR and CPNet, utilizing "class level" information in addition to pixel features, have achieved notable success for boosting the accuracy of existing network modules. However, the extracted class-level…

Computer Vision and Pattern Recognition · Computer Science 2022-07-15 Ye Huang, Di Kang, Liang Chen, Xuefei Zhe, Wenjing Jia, Xiangjian He, Linchao Bao

Confusion-Aware Spectral Regularizer for Long-Tailed Recognition

Long-tailed image classification remains a long-standing challenge, as real-world data typically follow highly imbalanced distributions where a few head classes dominate and many tail classes contain only limited samples. This imbalance…

Computational Engineering, Finance, and Science · Computer Science 2026-03-18 Ziquan Zhu, Gaojie Jin, Hanruo Zhu, Si-Yuan Lu, Yunxiao Zhang, Zeyu Fu, Ronghui Mu, Guoqiang Zhang, Zhao Sun, Xia Yuhang, Jiaxing Shang, Xiang Li, Lu Liu, Tianjin Huang

Semantic Context Matters: Improving Conditioning for Autoregressive Models

Recently, autoregressive (AR) models have shown strong potential in image generation, offering better scalability and easier integration with unified multi-modal systems compared to diffusion-based methods. However, extending AR models to…

Computer Vision and Pattern Recognition · Computer Science 2026-03-17 Dongyang Jin, Ryan Xu, Jianhao Zeng, Rui Lan, Yancheng Bai, Lei Sun, Xiangxiang Chu

Beyond Prototypes: Semantic Anchor Regularization for Better Representation Learning

One of the ultimate goals of representation learning is to achieve compactness within a class and well-separability between classes. Many outstanding metric-based and prototype-based methods following the Expectation-Maximization paradigm,…

Computer Vision and Pattern Recognition · Computer Science 2024-02-06 Yanqi Ge, Qiang Nie, Ye Huang, Yong Liu, Chengjie Wang, Feng Zheng, Wen Li, Lixin Duan

Learning structure-aware semantic segmentation with image-level supervision

Compared with expensive pixel-wise annotations, image-level labels make it possible to learn semantic segmentation in a weakly-supervised manner. Within this pipeline, the class activation map (CAM) is obtained and further processed to…

Computer Vision and Pattern Recognition · Computer Science 2022-01-06 Jiawei Liu, Jing Zhang, Yicong Hong, Nick Barnes

CART: Compositional Auto-Regressive Transformer for Image Generation

We propose a novel Auto-Regressive (AR) image generation approach that models images as hierarchical compositions of interpretable visual layers. While AR models have achieved transformative success in language modeling, replicating this…

Computer Vision and Pattern Recognition · Computer Science 2025-11-13 Siddharth Roheda, Rohit Chowdhury, Aniruddha Bala, Rohan Jaiswal

CARD: Non-Uniform Quantization of Visual Semantic Unit for Generative Recommendation

Generative recommendation frameworks typically represent items as discrete Semantic IDs (SIDs). While existing studies have sought to enhance SID construction by incorporating multimodal content, collaborative signals, or more advanced…

Information Retrieval · Computer Science 2026-04-30 Yibiao Wei, Jie Zou, Pengfei Zhang, Xiao Ao, Weikang Guo, Zeyu Ma, Yang Yang

CCC++: Optimized Color Classified Colorization with Segment Anything Model (SAM) Empowered Object Selective Color Harmonization

In this paper, we formulate the colorization problem into a multinomial classification problem and then apply a weighted function to classes. We propose a set of formulas to transform color values into color classes and vice versa. To…

Computer Vision and Pattern Recognition · Computer Science 2024-06-25 Mrityunjoy Gain, Avi Deb Raha, Rameswar Debnath

Class-Aware Mask-Guided Feature Refinement for Scene Text Recognition

Scene text recognition is a rapidly developing field that faces numerous challenges due to the complexity and diversity of scene text, including complex backgrounds, diverse fonts, flexible arrangements, and accidental occlusions. In this…

Computer Vision and Pattern Recognition · Computer Science 2024-02-22 Mingkun Yang, Biao Yang, Minghui Liao, Yingying Zhu, Xiang Bai

Finer-CAM: Spotting the Difference Reveals Finer Details for Visual Explanation

Class activation map (CAM) has been widely used to highlight image regions that contribute to class predictions. Despite its simplicity and computational efficiency, CAM often struggles to identify discriminative regions that distinguish…

Computer Vision and Pattern Recognition · Computer Science 2025-04-01 Ziheng Zhang, Jianyang Gu, Arpita Chowdhury, Zheda Mai, David Carlyn, Tanya Berger-Wolf, Yu Su, Wei-Lun Chao

Distribution-aware Noisy-label Crack Segmentation

Road crack segmentation is critical for robotic systems tasked with the inspection, maintenance, and monitoring of road infrastructures. Existing deep learning-based methods for crack segmentation are typically trained on specific datasets,…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Xiaoyan Jiang, Xinlong Wan, Kaiying Zhu, Xihe Qiu, Zhijun Fang

CONDA: Continual Unsupervised Domain Adaptation Learning in Visual Perception for Self-Driving Cars

Although unsupervised domain adaptation methods have achieved remarkable performance in semantic scene segmentation in visual perception for self-driving cars, these approaches remain impractical in real-world use cases. In practice, the…

Computer Vision and Pattern Recognition · Computer Science 2024-04-16 Thanh-Dat Truong, Pierce Helton, Ahmed Moustafa, Jackson David Cothren, Khoa Luu

SCALER: SAM-Enhanced Collaborative Learning for Label-Deficient Concealed Object Segmentation

Existing methods for label-deficient concealed object segmentation (LDCOS) either rely on consistency constraints or Segment Anything Model (SAM)-based pseudo-labeling. However, their performance remains limited due to the intrinsic…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Chunming He, Rihan Zhang, Longxiang Tang, Ziyun Yang, Kai Li, Deng-Ping Fan, Sina Farsiu

Segment Any Crack: Deep Semantic Segmentation Adaptation for Crack Detection

Image-based crack detection algorithms are increasingly in demand in infrastructure monitoring, as early detection of cracks is of paramount importance for timely maintenance planning. While deep learning has significantly advanced crack…

Computer Vision and Pattern Recognition · Computer Science 2025-04-22 Ghodsiyeh Rostami, Po-Han Chen, Mahdi S. Hosseini

Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes

During the last half decade, convolutional neural networks (CNNs) have triumphed over semantic segmentation, which is one of the core tasks in many applications such as autonomous driving. However, to train CNNs requires a considerable…

Computer Vision and Pattern Recognition · Computer Science 2018-11-15 Yang Zhang, Philip David, Boqing Gong

CORA: Consistency-Guided Semi-Supervised Framework for Reasoning Segmentation

Reasoning segmentation seeks pixel-accurate masks for targets referenced by complex, often implicit instructions, requiring context-dependent reasoning over the scene. Recent multimodal language models have advanced instruction following…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Prantik Howlader, Hoang Nguyen-Canh, Srijan Das, Jingyi Xu, Hieu Le, Dimitris Samaras

A Curriculum Domain Adaptation Approach to the Semantic Segmentation of Urban Scenes

During the last half decade, convolutional neural networks (CNNs) have triumphed over semantic segmentation, which is one of the core tasks in many applications such as autonomous driving and augmented reality. However, to train CNNs…

Computer Vision and Pattern Recognition · Computer Science 2019-01-11 Yang Zhang, Philip David, Hassan Foroosh, Boqing Gong

iCAR: Bridging Image Classification and Image-text Alignment for Visual Recognition

Image classification, which classifies images by pre-defined categories, has been the dominant approach to visual representation learning over the last decade. Visual learning through image-text alignment, however, has emerged to show…

Computer Vision and Pattern Recognition · Computer Science 2022-04-25 Yixuan Wei, Yue Cao, Zheng Zhang, Zhuliang Yao, Zhenda Xie, Han Hu, Baining Guo

All-pairs Consistency Learning for Weakly Supervised Semantic Segmentation

In this work, we propose a new transformer-based regularization to better localize objects for Weakly supervised semantic segmentation (WSSS). In image-level WSSS, Class Activation Map (CAM) is adopted to generate object localization as…

Computer Vision and Pattern Recognition · Computer Science 2023-09-26 Weixuan Sun, Yanhao Zhang, Zhen Qin, Zheyuan Liu, Lin Cheng, Fanyi Wang, Yiran Zhong, Nick Barnes

SCAR: State-Space Compression for Scalable AI-Based Network Management of Vehicular Services

The increasing demand for connected vehicular services poses significant challenges for AI-based network and service management due to the high volume and rapid variability of network state information. Traditional management and control…

Machine Learning · Computer Science 2026-02-03 Ioan-Sorin Comsa, Purav Shah, Karthik Vaidhyanathan, Deepak Gangadharan, Christof Imhof, Per Bergamin, Aryan Kaushik, Gabriel-Miro Muntean, Ramona Trestian