Related papers: ViewDiff: 3D-Consistent Image Generation with Text…

IT3D: Improved Text-to-3D Generation with Explicit View Synthesis

Recent strides in Text-to-3D techniques have been propelled by distilling knowledge from powerful large text-to-image diffusion models (LDMs). Nonetheless, existing Text-to-3D approaches often grapple with challenges such as…

Computer Vision and Pattern Recognition · Computer Science 2023-08-23 Yiwen Chen, Chi Zhang, Xiaofeng Yang, Zhongang Cai, Gang Yu, Lei Yang, Guosheng Lin

DreamFusion: Text-to-3D using 2D Diffusion

Recent breakthroughs in text-to-image synthesis have been driven by diffusion models trained on billions of image-text pairs. Adapting this approach to 3D synthesis would require large-scale datasets of labeled 3D data and efficient…

Computer Vision and Pattern Recognition · Computer Science 2022-09-30 Ben Poole, Ajay Jain, Jonathan T. Barron, Ben Mildenhall

ET3D: Efficient Text-to-3D Generation via Multi-View Distillation

Recent breakthroughs in text-to-image generation have shown encouraging results via large generative models. Due to the scarcity of 3D assets, it is hard to transfer the success of text-to-image generation to that of text-to-3D generation.…

Computer Vision and Pattern Recognition · Computer Science 2023-11-28 Yiming Chen, Zhiqi Li, Peidong Liu

3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models

Text-guided diffusion models have shown superior performance in image/video generation and editing, while few explorations have been performed in 3D scenarios. In this paper, we discuss three fundamental and interesting problems on this…

Computer Vision and Pattern Recognition · Computer Science 2023-10-13 Gang Li, Heliang Zheng, Chaoyue Wang, Chang Li, Changwen Zheng, Dacheng Tao

Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors

Most 3D generation research focuses on up-projecting 2D foundation models into the 3D space, either by minimizing 2D Score Distillation Sampling (SDS) loss or fine-tuning on multi-view datasets. Without explicit 3D priors, these methods…

Computer Vision and Pattern Recognition · Computer Science 2023-12-11 Lihe Ding, Shaocong Dong, Zhanpeng Huang, Zibin Wang, Yiyuan Zhang, Kaixiong Gong, Dan Xu, Tianfan Xue

Text-Image Conditioned Diffusion for Consistent Text-to-3D Generation

By lifting the pre-trained 2D diffusion models into Neural Radiance Fields (NeRFs), text-to-3D generation methods have made great progress. Many state-of-the-art approaches usually apply score distillation sampling (SDS) to optimize the…

Computer Vision and Pattern Recognition · Computer Science 2023-12-20 Yuze He, Yushi Bai, Matthieu Lin, Jenny Sheng, Yubin Hu, Qi Wang, Yu-Hui Wen, Yong-Jin Liu

IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation

Most text-to-3D generators build upon off-the-shelf text-to-image models trained on billions of images. They use variants of Score Distillation Sampling (SDS), which is slow, somewhat unstable, and prone to artifacts. A mitigation is to…

Computer Vision and Pattern Recognition · Computer Science 2024-02-14 Luke Melas-Kyriazi, Iro Laina, Christian Rupprecht, Natalia Neverova, Andrea Vedaldi, Oran Gafni, Filippos Kokkinos

Controllable 3D Object Generation with Single Image Prompt

Recently, the impressive generative capabilities of diffusion models have been demonstrated, producing images with remarkable fidelity. Particularly, existing methods for the 3D object generation task, which is one of the fastest-growing…

Computer Vision and Pattern Recognition · Computer Science 2025-12-01 Jaeseok Lee, Jaekoo Lee

Guide3D: Create 3D Avatars from Text and Image Guidance

Recently, text-to-image generation has exhibited remarkable advancements, with the ability to produce visually impressive results. In contrast, text-to-3D generation has not yet reached a comparable level of quality. Existing methods…

Computer Vision and Pattern Recognition · Computer Science 2023-08-21 Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong

Text-driven Visual Synthesis with Latent Diffusion Prior

There has been tremendous progress in large-scale text-to-image synthesis driven by diffusion models enabling versatile downstream applications such as 3D object synthesis from texts, image editing, and customized generation. We present a…

Computer Vision and Pattern Recognition · Computer Science 2023-04-05 Ting-Hsuan Liao, Songwei Ge, Yiran Xu, Yao-Chih Lee, Badour AlBahar, Jia-Bin Huang

HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise Estimation

In this paper, we study Text-to-3D content generation leveraging 2D diffusion priors to enhance the quality and detail of the generated 3D models. Recent progress (Magic3D) in text-to-3D has shown that employing high-resolution (e.g., 512 x…

Computer Vision and Pattern Recognition · Computer Science 2023-08-01 Jinbo Wu, Xiaobo Gao, Xing Liu, Zhengyang Shen, Chen Zhao, Haocheng Feng, Jingtuo Liu, Errui Ding

A Unified Approach for Text- and Image-guided 4D Scene Generation

Large-scale diffusion generative models are greatly simplifying image, video and 3D asset creation from user-provided text prompts and images. However, the challenging problem of text-to-4D dynamic 3D scene generation with diffusion…

Computer Vision and Pattern Recognition · Computer Science 2024-05-08 Yufeng Zheng, Xueting Li, Koki Nagano, Sifei Liu, Karsten Kreis, Otmar Hilliges, Shalini De Mello

Grid Diffusion Models for Text-to-Video Generation

Recent advances in diffusion models have significantly improved text-to-image generation. However, generating videos from text is a more challenging task than generating images from text, due to the much larger dataset and higher…

Computer Vision and Pattern Recognition · Computer Science 2024-12-31 Taegyeong Lee, Soyeong Kwon, Taehwan Kim

DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model

Recent 3D generative models have achieved remarkable performance in synthesizing high resolution photorealistic images with view consistency and detailed 3D shapes, but training them for diverse domains is challenging since it requires…

Computer Vision and Pattern Recognition · Computer Science 2023-04-03 Gwanghyun Kim, Se Young Chun

Text-to-Image Generation via Implicit Visual Guidance and Hypernetwork

We develop an approach for text-to-image generation that embraces additional retrieval images, driven by a combination of implicit visual guidance loss and generative objectives. Unlike most existing text-to-image generation methods which…

Computer Vision and Pattern Recognition · Computer Science 2022-08-19 Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, John Collomosse

PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models

Text- or image-to-3D generators and 3D scanners can now produce 3D assets with high-quality shapes and textures. These assets typically consist of a single, fused representation, like an implicit neural field, a Gaussian mixture, or a mesh,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-31 Minghao Chen, Roman Shapovalov, Iro Laina, Tom Monnier, Jianyuan Wang, David Novotny, Andrea Vedaldi

Text-Guided Texturing by Synchronized Multi-View Diffusion

This paper introduces a novel approach to synthesize texture to dress up a given 3D object, given a text prompt. Based on the pretrained text-to-image (T2I) diffusion model, existing methods usually employ a project-and-inpaint approach, in…

Computer Vision and Pattern Recognition · Computer Science 2025-03-19 Yuxin Liu, Minshan Xie, Hanyuan Liu, Tien-Tsin Wong

Text-to-3D Generation by 2D Editing

Distilling 3D representations from pretrained 2D diffusion models is essential for 3D creative applications across gaming, film, and interior design. Current SDS-based methods are hindered by inefficient information distillation from…

Computer Vision and Pattern Recognition · Computer Science 2025-03-13 Haoran Li, Yuli Tian, Yonghui Wang, Yong Liao, Lin Wang, Yuyang Wang, Peng Yuan Zhou

3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation

Text-driven 3D scene generation techniques have made rapid progress in recent years. Their success is mainly attributed to using existing generative models to iteratively perform image warping and inpainting to generate 3D scenes. However,…

Computer Vision and Pattern Recognition · Computer Science 2024-03-15 Frank Zhang, Yibo Zhang, Quan Zheng, Rui Ma, Wei Hua, Hujun Bao, Weiwei Xu, Changqing Zou

MVGaussian: High-Fidelity text-to-3D Content Generation with Multi-View Guidance and Surface Densification

The field of text-to-3D content generation has made significant progress in generating realistic 3D objects, with existing methodologies like Score Distillation Sampling (SDS) offering promising guidance. However, these methods often…

Computer Vision and Pattern Recognition · Computer Science 2024-09-11 Phu Pham, Aradhya N. Mathur, Ojaswa Sharma, Aniket Bera