Diff3DETR: Agent-based Diffusion Model for Semi-supervised 3D Object Detection

Computer Vision and Pattern Recognition 2024-08-02 v1

Abstract

3D object detection is essential for understanding 3D scenes. Contemporary techniques often require extensive annotated training data, yet obtaining point-wise annotations for point clouds is time-consuming and laborious. Recent developments in semi-supervised methods seek to mitigate this problem by employing a teacher-student framework to generate pseudo-labels for unlabeled point clouds. However, these pseudo-labels frequently suffer from insufficient diversity and inferior quality. To overcome these hurdles, we introduce an Agent-based Diffusion Model for Semi-supervised 3D Object Detection (Diff3DETR). Specifically, an agent-based object query generator is designed to produce object queries that effectively adapt to dynamic scenes while striking a balance between sampling locations and content embedding. Additionally, a box-aware denoising module utilizes the DDIM denoising process and the long-range attention in the transformer decoder to refine bounding boxes incrementally. Extensive experiments on ScanNet and SUN RGB-D datasets demonstrate that Diff3DETR outperforms state-of-the-art semi-supervised 3D object detection methods.
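The abstract names the DDIM denoising process but does not state the update rule. As a rough illustration only, the standard deterministic DDIM step (eta = 0) can be sketched as below for a flat list of box parameters. This is not the authors' implementation: the function name `ddim_step`, the six-value box layout, and the noise-schedule values are all assumptions made for the example, and the "predicted clean box" would in practice come from the detector's decoder.

```python
import math

def ddim_step(x_t, x0_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0).

    x_t            : noisy box parameters at the current step (list of floats)
    x0_pred        : the model's estimate of the clean box (list of floats)
    alpha_bar_t    : cumulative noise-schedule product at the current step, in (0, 1)
    alpha_bar_prev : cumulative product at the target (less noisy) step, in (0, 1]
    """
    sqrt_ab_t = math.sqrt(alpha_bar_t)
    sqrt_one_minus_ab_t = math.sqrt(1.0 - alpha_bar_t)
    sqrt_ab_prev = math.sqrt(alpha_bar_prev)
    sqrt_one_minus_ab_prev = math.sqrt(1.0 - alpha_bar_prev)

    x_prev = []
    for xt, x0 in zip(x_t, x0_pred):
        # Noise implied by the current sample and the clean estimate.
        eps = (xt - sqrt_ab_t * x0) / sqrt_one_minus_ab_t
        # Re-noise the clean estimate to the lower noise level.
        x_prev.append(sqrt_ab_prev * x0 + sqrt_one_minus_ab_prev * eps)
    return x_prev

# Hypothetical box layout (cx, cy, cz, w, h, l); all values are made up.
noisy_box = [0.9, -0.2, 0.4, 1.5, 0.7, 1.1]
predicted_box = [1.0, 0.0, 0.5, 1.2, 0.8, 1.0]
refined_box = ddim_step(noisy_box, predicted_box, alpha_bar_t=0.5, alpha_bar_prev=0.9)
```

Repeating this step over a decreasing sequence of noise levels, with a fresh clean-box estimate each time, is what refining boxes incrementally amounts to in a DDIM-style sampler. Setting `alpha_bar_prev` to 1.0 returns the clean estimate itself.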

Cite

@article{arxiv.2408.00286,
  title  = {Diff3DETR: Agent-based Diffusion Model for Semi-supervised 3D Object Detection},
  author = {Jiacheng Deng and Jiahao Lu and Tianzhu Zhang},
  journal= {arXiv preprint arXiv:2408.00286},
  year   = {2024}
}

Comments

Accepted to ECCV 2024