Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning

Computer Vision and Pattern Recognition 2024-07-03 v1

Abstract

Existing contrastive learning methods mainly focus on single-grained representation learning (e.g., part-level, object-level, or scene-level), and thus neglect the transferability of representations to other granularity levels. In this paper, we aim to learn multi-grained representations that effectively describe an image at various granularity levels, thereby improving generalization across a wide range of downstream tasks. To this end, we propose a novel Multi-Grained Contrast method (MGC) for unsupervised representation learning. Specifically, we construct delicate multi-grained correspondences between positive views and then conduct multi-grained contrast based on these correspondences to learn more general unsupervised representations. Without pretraining on a large-scale dataset, our method significantly outperforms existing state-of-the-art methods on extensive downstream tasks, including object detection, instance segmentation, scene parsing, semantic segmentation, and keypoint detection. Moreover, experimental results support the data efficiency and excellent representation transferability of our method. The source code and trained weights are available at https://github.com/visresearch/mgc.

Cite

@article{arxiv.2407.02014,
  title  = {Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning},
  author = {Chengchao Shen and Jianzhong Chen and Jianxin Wang},
  journal = {arXiv preprint arXiv:2407.02014},
  year   = {2024}
}