Related papers: Zero123++: a Single Image to Consistent Multi-view…

Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views

Synthesizing multi-view 3D from one single image is a significant but challenging task. Zero-1-to-3 methods have achieved great success by lifting a 2D latent diffusion model to the 3D scope. The target view image is generated with a…

Computer Vision and Pattern Recognition · Computer Science 2024-08-09 Yabo Chen, Jiemin Fang, Yuyang Huang, Taoran Yi, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian

Consistent123: Improve Consistency for One Image to 3D Object Synthesis

Large image diffusion models enable novel view synthesis with high quality and excellent zero-shot capability. However, such models based on image-to-image translation have no guarantee of view consistency, limiting the performance for…

Computer Vision and Pattern Recognition · Computer Science 2023-10-13 Haohan Weng, Tianyu Yang, Jianan Wang, Yu Li, Tong Zhang, C. L. Philip Chen, Lei Zhang

ConsistNet: Enforcing 3D Consistency for Multi-view Images Diffusion

Given a single image of a 3D object, this paper proposes a novel method (named ConsistNet) that is able to generate multiple images of the same object, as if they were captured from different viewpoints, while the 3D (multi-view)…

Computer Vision and Pattern Recognition · Computer Science 2023-10-17 Jiayu Yang, Ziang Cheng, Yunfei Duan, Pan Ji, Hongdong Li

SyncDreamer: Generating Multiview-consistent Images from a Single-view Image

In this paper, we present a novel diffusion model called SyncDreamer that generates multiview-consistent images from a single-view image. Using pretrained large-scale 2D diffusion models, recent work Zero123 demonstrates the ability to generate…

Computer Vision and Pattern Recognition · Computer Science 2024-04-16 Yuan Liu, Cheng Lin, Zijiao Zeng, Xiaoxiao Long, Lingjie Liu, Taku Komura, Wenping Wang

Zero-1-to-3: Zero-shot One Image to 3D Object

We introduce Zero-1-to-3, a framework for changing the camera viewpoint of an object given just a single RGB image. To perform novel view synthesis in this under-constrained setting, we capitalize on the geometric priors that large-scale…

Computer Vision and Pattern Recognition · Computer Science 2023-03-21 Ruoshi Liu, Rundi Wu, Basile Van Hoorick, Pavel Tokmakov, Sergey Zakharov, Carl Vondrick

DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis

We present DiffPortrait3D, a conditional diffusion model that is capable of synthesizing 3D-consistent photo-realistic novel views from as few as a single in-the-wild portrait. Specifically, given a single RGB input, we aim to synthesize…

Computer Vision and Pattern Recognition · Computer Science 2025-03-21 Yuming Gu, You Xie, Hongyi Xu, Guoxian Song, Yichun Shi, Di Chang, Jing Yang, Linjie Luo

Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors

Reconstructing 3D objects from a single image guided by pretrained diffusion models has demonstrated promising outcomes. However, due to utilizing the case-agnostic rigid strategy, their generalization ability to arbitrary cases and the 3D…

Computer Vision and Pattern Recognition · Computer Science 2024-02-21 Yukang Lin, Haonan Han, Chaoqun Gong, Zunnan Xu, Yachao Zhang, Xiu Li

Consistent-1-to-3: Consistent Image to 3D View Synthesis via Geometry-aware Diffusion Models

Zero-shot novel view synthesis (NVS) from a single image is an essential problem in 3D object understanding. While recent approaches that leverage pre-trained generative models can synthesize high-quality novel views from in-the-wild…

Computer Vision and Pattern Recognition · Computer Science 2024-03-18 Jianglong Ye, Peng Wang, Kejie Li, Yichun Shi, Heng Wang

Scene123: One Prompt to 3D Scene Generation via Video-Assisted and Consistency-Enhanced MAE

As Artificial Intelligence Generated Content (AIGC) advances, a variety of methods have been developed to generate text, images, videos, and 3D objects from single or multimodal inputs, contributing efforts to emulate human-like cognitive…

Computer Vision and Pattern Recognition · Computer Science 2024-08-21 Yiying Yang, Fukun Yin, Jiayuan Fan, Xin Chen, Wanzhang Li, Gang Yu

3D-aware Image Generation using 2D Diffusion Models

In this paper, we introduce a novel 3D-aware image generation method that leverages 2D diffusion models. We formulate the 3D-aware image generation task as multiview 2D image set generation, and further to a sequential…

Computer Vision and Pattern Recognition · Computer Science 2023-04-03 Jianfeng Xiang, Jiaolong Yang, Binbin Huang, Xin Tong

MVDiff: Scalable and Flexible Multi-View Diffusion for 3D Object Reconstruction from Single-View

Generating consistent multiple views for 3D reconstruction tasks is still a challenge for existing image-to-3D diffusion models. Generally, incorporating 3D representations into a diffusion model decreases the model's speed as well as…

Computer Vision and Pattern Recognition · Computer Science 2024-06-14 Emmanuelle Bourigault, Pauline Bourigault

Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting

Recent one image to 3D generation methods commonly adopt Score Distillation Sampling (SDS). Despite the impressive results, there are multiple deficiencies including multi-view inconsistency, over-saturated and over-smoothed textures, as…

Computer Vision and Pattern Recognition · Computer Science 2023-12-29 Junwu Zhang, Zhenyu Tang, Yatian Pang, Xinhua Cheng, Peng Jin, Yida Wei, Munan Ning, Li Yuan

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild

Achieving machine autonomy and human control often represent divergent objectives in the design of interactive AI systems. Visual generative foundation models such as Stable Diffusion show promise in navigating these goals, especially when…

Computer Vision and Pattern Recognition · Computer Science 2023-11-03 Can Qin, Shu Zhang, Ning Yu, Yihao Feng, Xinyi Yang, Yingbo Zhou, Huan Wang, Juan Carlos Niebles, Caiming Xiong, Silvio Savarese, Stefano Ermon, Yun Fu, Ran Xu

ViewMask-1-to-3: Multi-View Consistent Image Generation via Multimodal Diffusion Models

Motivated by discrete diffusion's success in language-vision modeling, we explore its potential for multi-view generation, a task dominated by continuous approaches. We introduce ViewMask-1-to-3, formulating multi-view synthesis as a…

Computer Vision and Pattern Recognition · Computer Science 2026-03-16 Ruishu Zhu, Zhihao Huang, Jiacheng Sun, Ping Luo, Hongyuan Zhang, Xuelong Li

Ctrl123: Consistent Novel View Synthesis via Closed-Loop Transcription

Large image diffusion models have demonstrated zero-shot capability in novel view synthesis (NVS). However, existing diffusion-based NVS methods struggle to generate novel views that are accurately consistent with the corresponding ground…

Computer Vision and Pattern Recognition · Computer Science 2024-06-25 Hongxiang Zhao, Xili Dai, Jianan Wang, Shengbang Tong, Jingyuan Zhang, Weida Wang, Lei Zhang, Yi Ma

ConTEXTure: Consistent Multiview Images to Texture

We introduce ConTEXTure, a generative network designed to create a texture map/atlas for a given 3D mesh using images from multiple viewpoints. The process begins with generating a front-view image from a text prompt, such as 'Napoleon,…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Jaehoon Ahn, Sumin Cho, Harim Jung, Kibeom Hong, Seonghoon Ban, Moon-Ryul Jung

ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image

We introduce a 3D-aware diffusion model, ZeroNVS, for single-image novel view synthesis for in-the-wild scenes. While existing methods are designed for single objects with masked backgrounds, we propose new techniques to address challenges…

Computer Vision and Pattern Recognition · Computer Science 2024-04-25 Kyle Sargent, Zizhang Li, Tanmay Shah, Charles Herrmann, Hong-Xing Yu, Yunzhi Zhang, Eric Ryan Chan, Dmitry Lagun, Li Fei-Fei, Deqing Sun, Jiajun Wu

Landscape-Awareness for Geometric View Diffusion Model

Accurate camera viewpoint estimation under sparse-view conditions remains challenging, particularly in two-view scenarios. Recent approaches leverage diffusion models such as Zero123 to synthesize novel views conditioned on relative…

Computer Vision and Pattern Recognition · Computer Science 2026-05-20 Yan-Ting Chen, Hao-Wei Chen, Tsu-Ching Hsiao, Chun-Yi Lee

One Diffusion to Generate Them All

We introduce OneDiffusion, a versatile, large-scale diffusion model that seamlessly supports bidirectional image synthesis and understanding across diverse tasks. It enables conditional generation from inputs such as text, depth, pose,…

Computer Vision and Pattern Recognition · Computer Science 2025-06-16 Duong H. Le, Tuan Pham, Sangho Lee, Christopher Clark, Aniruddha Kembhavi, Stephan Mandt, Ranjay Krishna, Jiasen Lu

VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model

Generating multi-view images based on text or single-image prompts is a critical capability for the creation of 3D content. Two fundamental questions on this topic are what data we use for training and how to ensure multi-view consistency.…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Qi Zuo, Xiaodong Gu, Lingteng Qiu, Yuan Dong, Zhengyi Zhao, Weihao Yuan, Rui Peng, Siyu Zhu, Zilong Dong, Liefeng Bo, Qixing Huang