Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model

Ruoxi Shi; Hansheng Chen; Zhuoyang Zhang; Minghua Liu; Chao Xu; Xinyue Wei; Linghao Chen; Chong Zeng; Hao Su

Computer Vision and Pattern Recognition; Graphics; 2023-10-24 (v1)

Abstract

We report Zero123++, an image-conditioned diffusion model for generating 3D-consistent multi-view images from a single input view. To take full advantage of pretrained 2D generative priors, we develop various conditioning and training schemes to minimize the effort of finetuning from off-the-shelf image diffusion models such as Stable Diffusion. Zero123++ excels in producing high-quality, consistent multi-view images from a single image, overcoming common issues like texture degradation and geometric misalignment. Furthermore, we showcase the feasibility of training a ControlNet on Zero123++ for enhanced control over the generation process. The code is available at https://github.com/SUDO-AI-3D/zero123plus.

Keywords

novel view synthesis; text-to-3d generation; image generation

Cite

@article{arxiv.2310.15110,
  title  = {Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model},
  author = {Ruoxi Shi and Hansheng Chen and Zhuoyang Zhang and Minghua Liu and Chao Xu and Xinyue Wei and Linghao Chen and Chong Zeng and Hao Su},
  journal= {arXiv preprint arXiv:2310.15110},
  year   = {2023}
}
