Related papers: Multi-label Cluster Discrimination for Visual Repr…

Multimodal Multilabel Classification by CLIP

Multimodal multilabel classification (MMC) is a challenging task that aims to design a learning algorithm to handle two data sources, image and text, and learn a comprehensive semantic feature representation across the modalities. In this…

Computer Vision and Pattern Recognition · Computer Science 2024-06-25 Yanming Guo

CLIP-CID: Efficient CLIP Distillation via Cluster-Instance Discrimination

Contrastive Language-Image Pre-training (CLIP) has achieved excellent performance over a wide range of tasks. However, the effectiveness of CLIP heavily relies on a substantial corpus of pre-training data, resulting in notable consumption…

Computer Vision and Pattern Recognition · Computer Science 2024-12-17 Kaicheng Yang, Tiancheng Gu, Xiang An, Haiqiang Jiang, Xiangzi Dai, Ziyong Feng, Weidong Cai, Jiankang Deng

DiffCLIP: Few-shot Language-driven Multimodal Classifier

Visual language models like Contrastive Language-Image Pretraining (CLIP) have shown impressive performance in analyzing natural images with language information. However, these models often encounter challenges when applied to specialized…

Computer Vision and Pattern Recognition · Computer Science 2024-12-11 Jiaqing Zhang, Mingxiang Cao, Xue Yang, Kai Jiang, Yunsong Li

Contrastive Localized Language-Image Pre-Training

Contrastive Language-Image Pre-training (CLIP) has been a celebrated method for training vision encoders to generate image/text representations facilitating various applications. Recently, CLIP has been widely adopted as the vision backbone…

Computer Vision and Pattern Recognition · Computer Science 2025-02-20 Hong-You Chen, Zhengfeng Lai, Haotian Zhang, Xinze Wang, Marcin Eichner, Keen You, Meng Cao, Bowen Zhang, Yinfei Yang, Zhe Gan

Deep Multiview Clustering by Contrasting Cluster Assignments

Multiview clustering (MVC) aims to reveal the underlying structure of multiview data by categorizing data samples into clusters. Deep learning-based methods exhibit strong feature learning capabilities on large-scale datasets. For most…

Computer Vision and Pattern Recognition · Computer Science 2024-01-30 Jie Chen, Hua Mao, Wai Lok Woo, Xi Peng

CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification

This paper presents a CLIP-based unsupervised learning method for annotation-free multi-label image classification, including three stages: initialization, training, and inference. At the initialization stage, we take full advantage of the…

Computer Vision and Pattern Recognition · Computer Science 2024-03-08 Rabab Abdelfattah, Qing Guo, Xiaoguang Li, Xiaofeng Wang, Song Wang

Multi-level Supervised Contrastive Learning

Contrastive learning is a well-established paradigm in representation learning. The standard framework of contrastive learning minimizes the distance between "similar" instances and maximizes the distance between dissimilar ones in the…

Machine Learning · Computer Science 2025-02-06 Naghmeh Ghanooni, Barbod Pajoum, Harshit Rawal, Sophie Fellenz, Vo Nguyen Le Duy, Marius Kloft

Deep Clustering by Semantic Contrastive Learning

Whilst contrastive learning has recently brought notable benefits to deep clustering of unlabelled images by learning sample-specific discriminative visual features, its potential for explicitly inferring class decision boundaries is less…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Jiabo Huang, Shaogang Gong

CLLD: Contrastive Learning with Label Distance for Text Classification

Existing pre-trained models have achieved state-of-the-art performance on various text classification tasks. These models have proven to be useful in learning universal language representations. However, the semantic discrepancy between…

Machine Learning · Computer Science 2022-01-07 Jinhe Lan, Qingyuan Zhan, Chenhao Jiang, Kunping Yuan, Desheng Wang

ProbMCL: Simple Probabilistic Contrastive Learning for Multi-label Visual Classification

Multi-label image classification presents a challenging task in many domains, including computer vision and medical imaging. Recent advancements have introduced graph-based and transformer-based methods to improve performance and capture…

Computer Vision and Pattern Recognition · Computer Science 2024-04-15 Ahmad Sajedi, Samir Khaki, Yuri A. Lawryshyn, Konstantinos N. Plataniotis

Classifier-guided CLIP Distillation for Unsupervised Multi-label Classification

Multi-label classification is crucial for comprehensive image understanding, yet acquiring accurate annotations is challenging and costly. To address this, a recent study suggests exploiting unsupervised multi-label classification…

Computer Vision and Pattern Recognition · Computer Science 2025-03-24 Dongseob Kim, Hyunjung Shim

Unicom: Universal and Compact Representation Learning for Image Retrieval

Modern image retrieval methods typically rely on fine-tuning pre-trained encoders to extract image-level descriptors. However, the most widely used models are pre-trained on ImageNet-1K with limited classes. The pre-trained feature…

Computer Vision and Pattern Recognition · Computer Science 2023-04-13 Xiang An, Jiankang Deng, Kaicheng Yang, Jaiwei Li, Ziyong Feng, Jia Guo, Jing Yang, Tongliang Liu

Multi-Label Image Classification with Contrastive Learning

Recently, as an effective way of learning latent representations, contrastive learning has been increasingly popular and successful in various domains. The success of contrastive learning in single-label classification motivates us to…

Computer Vision and Pattern Recognition · Computer Science 2021-07-27 Son D. Dao, Ethan Zhao, Dinh Phung, Jianfei Cai

Transductive CLIP with Class-Conditional Contrastive Learning

Inspired by the remarkable zero-shot generalization capacity of vision-language pre-trained models, we seek to leverage the supervision from the CLIP model to alleviate the burden of data labeling. However, such supervision inevitably contains…

Computer Vision and Pattern Recognition · Computer Science 2022-06-14 Junchu Huang, Weijie Chen, Shicai Yang, Di Xie, Shiliang Pu, Yueting Zhuang

Label Structure Preserving Contrastive Embedding for Multi-Label Learning with Missing Labels

Contrastive learning (CL) has shown impressive advances in image representation learning, in both supervised multi-class classification and unsupervised learning. However, these CL methods cannot be directly adapted to multi-label image…

Computer Vision and Pattern Recognition · Computer Science 2022-09-07 Zhongchen Ma, Lisha Li, Qirong Mao, Songcan Chen

CLIP-Decoder : ZeroShot Multilabel Classification using Multimodal CLIP Aligned Representation

Multi-label classification is an essential task utilized in a wide variety of real-world applications. Multi-label zero-shot learning is a method for classifying images into multiple unseen categories for which no training data is…

Computer Vision and Pattern Recognition · Computer Science 2024-06-24 Muhammad Ali, Salman Khan

Rethinking Multiple Instance Learning for Whole Slide Image Classification: A Good Instance Classifier is All You Need

Weakly supervised whole slide image classification is usually formulated as a multiple instance learning (MIL) problem, where each slide is treated as a bag, and the patches cut out of it are treated as instances. Existing methods either…

Computer Vision and Pattern Recognition · Computer Science 2024-05-14 Linhao Qu, Yingfan Ma, Xiaoyuan Luo, Manning Wang, Zhijian Song

ENCLIP: Ensembling and Clustering-Based Contrastive Language-Image Pretraining for Fashion Multimodal Search with Limited Data and Low-Quality Images

Multimodal search has revolutionized the fashion industry, providing a seamless and intuitive way for users to discover and explore fashion items. Based on their preferences, style, or specific attributes, users can search for products by…

Computer Vision and Pattern Recognition · Computer Science 2024-11-26 Prithviraj Purushottam Naik, Rohit Agarwal

Dual-Level Cross-Modal Contrastive Clustering

Image clustering, which involves grouping images into different clusters without labels, is a key task in unsupervised learning. Although previous deep clustering methods have achieved remarkable results, they only explore the intrinsic…

Computer Vision and Pattern Recognition · Computer Science 2024-09-23 Haixin Zhang, Yongjun Li, Dong Huang

The Solution for Language-Enhanced Image New Category Discovery

Treating texts as images, combining prompts with textual labels for prompt tuning, and leveraging the alignment properties of CLIP have been successfully applied in zero-shot multi-label image recognition. Nonetheless, relying solely on…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 Haonan Xu, Dian Chao, Xiangyu Wu, Zhonghua Wan, Yang Yang