PatchContrast: Self-Supervised Pre-training for 3D Object Detection

Computer Vision and Pattern Recognition 2025-04-15 v2

Abstract

Accurately detecting objects in the environment is a key challenge for autonomous vehicles. However, obtaining annotated data for detection is expensive and time-consuming. We introduce PatchContrast, a novel self-supervised point cloud pre-training framework for 3D object detection. We propose using two levels of abstraction to learn discriminative representations from unlabeled data: the proposal level and the patch level. The proposal level aims to localize objects in relation to their surroundings, whereas the patch level adds information about the internal connections between an object's components, thereby distinguishing between different objects based on their individual components. We demonstrate how these levels can be integrated into self-supervised pre-training for various backbones to enhance the downstream 3D detection task. We show that our method outperforms existing state-of-the-art models on three commonly used 3D detection datasets.

Cite

@article{arxiv.2308.06985,
  title  = {PatchContrast: Self-Supervised Pre-training for 3D Object Detection},
  author = {Oren Shrout and Ori Nizan and Yizhak Ben-Shabat and Ayellet Tal},
  journal = {arXiv preprint arXiv:2308.06985},
  year   = {2025}
}

Comments

CVPRW 2025
