Zero3D: Semantic-Driven Multi-Category 3D Shape Generation

Computer Vision and Pattern Recognition 2023-11-15 v5 Multimedia

Abstract

Semantic-driven 3D shape generation aims to generate 3D objects conditioned on text. Previous works suffer from single-category generation, low-frequency 3D details, and the need for large paired datasets for training. To tackle these challenges, we propose a multi-category conditional diffusion model. Specifically, 1) to alleviate the lack of large-scale paired data, we bridge text, 2D images, and 3D shapes using the pre-trained CLIP model; 2) to obtain multi-category 3D shape features, we apply a conditional flow model to generate a 3D shape vector conditioned on the CLIP embedding; and 3) to generate multi-category 3D shapes, we employ a hidden-layer diffusion model conditioned on the multi-category shape vector, which greatly reduces training time and memory consumption.
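The second step of the abstract, a flow that maps noise to a shape vector conditioned on a CLIP embedding, can be illustrated with a single conditional affine layer. This is only a minimal sketch and not the authors' architecture: the dimensions, the untrained random weights, the names `flow_forward` and `flow_inverse`, and the random stand-in for a CLIP embedding are all assumptions made for illustration. The diffusion stage is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

EMB_DIM = 512    # assumed size of a CLIP-style embedding
SHAPE_DIM = 64   # hypothetical size of the shape vector

# Hypothetical, untrained conditioner weights (a real model would learn these).
W_scale = rng.normal(scale=0.01, size=(EMB_DIM, SHAPE_DIM))
W_shift = rng.normal(scale=0.01, size=(EMB_DIM, SHAPE_DIM))


def flow_forward(z, cond):
    """Map base noise z to a shape vector, conditioned on the embedding cond."""
    log_scale = np.tanh(cond @ W_scale)  # bounded log-scale keeps exp() stable
    shift = cond @ W_shift
    return z * np.exp(log_scale) + shift


def flow_inverse(x, cond):
    """Invert flow_forward exactly, recovering the base noise from a shape vector."""
    log_scale = np.tanh(cond @ W_scale)
    shift = cond @ W_shift
    return (x - shift) * np.exp(-log_scale)


# Random unit vector standing in for a CLIP text or image embedding.
text_emb = rng.normal(size=EMB_DIM)
text_emb /= np.linalg.norm(text_emb)

z = rng.normal(size=SHAPE_DIM)           # sample from the base distribution
shape_vec = flow_forward(z, text_emb)    # conditioned shape vector
z_back = flow_inverse(shape_vec, text_emb)
```

Because the layer is invertible for any fixed condition, a flow of this kind can be trained by exact likelihood on shape vectors and then sampled from by drawing noise. In the pipeline the abstract describes, the resulting shape vector would then condition the diffusion model.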

Cite

@article{arxiv.2301.13591,
  title  = {Zero3D: Semantic-Driven Multi-Category 3D Shape Generation},
  author = {Bo Han and Yitong Fu and Yixuan Shen},
  journal= {arXiv preprint arXiv:2301.13591},
  year   = {2023}
}

Comments

work in progress
