Related papers: Stable Virtual Camera: Generative View Synthesis w…

200 papers

Novel view synthesis from a single image has been a cornerstone problem for many Virtual Reality applications that provide immersive experiences. However, most existing techniques can only synthesize novel views within a limited range of…

Computer Vision and Pattern Recognition · Computer Science 2023-03-31 Hung-Yu Tseng, Qinbo Li, Changil Kim, Suhib Alsisan, Jia-Bin Huang, Johannes Kopf

This paper explores the innovative application of Stable Video Diffusion (SVD), a diffusion model that revolutionizes the creation of dynamic video content from static images. As digital media and design industries accelerate, SVD emerges…

Human-Computer Interaction · Computer Science 2024-05-24 Elijah Miller, Thomas Dupont, Mingming Wang

Generating novel views of an object from a single image is a challenging task. It requires an understanding of the underlying 3D structure of the object from an image and rendering high-quality, spatially consistent new views. While recent…

Computer Vision and Pattern Recognition · Computer Science 2023-12-05 Jeong-gi Kwak, Erqun Dong, Yuhe Jin, Hanseok Ko, Shweta Mahajan, Kwang Moo Yi

We present Stable View Synthesis (SVS). Given a set of source images depicting a scene from freely distributed viewpoints, SVS synthesizes new views of the scene. The method operates on a geometric scaffold computed via…

Computer Vision and Pattern Recognition · Computer Science 2021-05-04 Gernot Riegler, Vladlen Koltun

We present Stable Video 3D (SV3D) -- a latent video diffusion model for high-resolution, image-to-multi-view generation of orbital videos around a 3D object. Recent work on 3D generation proposes techniques to adapt 2D generative models for…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Vikram Voleti, Chun-Han Yao, Mark Boss, Adam Letts, David Pankratz, Dmitry Tochilkin, Christian Laforte, Robin Rombach, Varun Jampani

Generating high-quality novel views of a scene from a single image requires maintaining structural coherence across different views, referred to as view consistency. While diffusion models have driven advancements in novel view synthesis,…

Computer Vision and Pattern Recognition · Computer Science 2025-08-07 Jiwoo Park, Tae Eun Choi, Youngjun Jun, Seong Jae Hwang

We present Stable Video 4D (SV4D), a latent video diffusion model for multi-frame and multi-view consistent dynamic 3D content generation. Unlike previous methods that rely on separately trained generative models for video generation and…

Computer Vision and Pattern Recognition · Computer Science 2025-03-03 Yiming Xie, Chun-Han Yao, Vikram Voleti, Huaizu Jiang, Varun Jampani

Novel view synthesis (NVS) from a single image is highly ill-posed due to large unobserved regions, especially for views that deviate significantly from the input. While existing methods focus on consistency between the source and generated…

Computer Vision and Pattern Recognition · Computer Science 2025-09-03 Xueyang Kang, Zhengkang Xiang, Zezheng Zhang, Kourosh Khoshelham

Synthesizing a novel view from a single input image is a challenging task. Traditionally, this task was approached by estimating scene depth, warping, and inpainting, with machine learning models enabling parts of the pipeline. More…

Computer Vision and Pattern Recognition · Computer Science 2024-11-13 Noam Elata, Bahjat Kawar, Yaron Ostrovsky-Berman, Miriam Farber, Ron Sokolovsky

Novel view synthesis from a single input image is a challenging task, where the goal is to generate a new view of a scene from a desired camera pose that may be separated by a large motion. The highly uncertain nature of this synthesis task…

Computer Vision and Pattern Recognition · Computer Science 2023-08-23 Jason J. Yu, Fereshteh Forghani, Konstantinos G. Derpanis, Marcus A. Brubaker

Modern video generative models based on diffusion models can produce very realistic clips, but they are computationally inefficient, often requiring minutes of GPU time for just a few seconds of video. This inefficiency poses a critical…

Computer Vision and Pattern Recognition · Computer Science 2026-01-15 Jieying Chen, Jeffrey Hu, Joan Lasenby, Ayush Tewari

Novel-view synthesis techniques achieve impressive results for static scenes but struggle when faced with the inconsistencies inherent to casual capture settings: varying illumination, scene motion, and other unintended effects that are…

Diffusion-based methods can generate realistic images and videos, but they struggle to edit existing objects in a video while preserving their appearance over time. This prevents diffusion models from being applied to natural video editing…

Computer Vision and Pattern Recognition · Computer Science 2023-08-21 Wenhao Chai, Xun Guo, Gaoang Wang, Yan Lu

Autoregressive video diffusion models are capable of long rollouts that are stable and consistent with history, but they are unable to guide the current generation with conditioning from the future. In camera-guided video generation with a…

Computer Vision and Pattern Recognition · Computer Science 2026-04-13 Chonghyuk Song, Michal Stary, Boyuan Chen, George Kopanas, Vincent Sitzmann

Synthesizing novel views from monocular videos of dynamic scenes remains a challenging problem. Scene-specific methods that optimize 4D representations with explicit motion priors often break down in highly dynamic regions where multi-view…

Computer Vision and Pattern Recognition · Computer Science 2026-04-01 Thomas Tanay, Mohammed Brahimi, Michal Nazarczuk, Qingwen Zhang, Sibi Catley-Chandar, Arthur Moreau, Zhensong Zhang, Eduardo Pérez-Pellitero

Collecting multi-view driving scenario videos to enhance the performance of 3D visual perception tasks presents significant challenges and incurs substantial costs, making generative models for realistic data an appealing alternative. Yet,…

Computer Vision and Pattern Recognition · Computer Science 2025-04-29 Junpeng Jiang, Gangyi Hong, Miao Zhang, Hengtong Hu, Kun Zhan, Rui Shao, Liqiang Nie

We present a diffusion-based model for 3D-aware generative novel view synthesis from as few as a single input image. Our model samples from the distribution of possible renderings consistent with the input and, even in the presence of…

Computer Vision and Pattern Recognition · Computer Science 2023-04-06 Eric R. Chan, Koki Nagano, Matthew A. Chan, Alexander W. Bergman, Jeong Joon Park, Axel Levy, Miika Aittala, Shalini De Mello, Tero Karras, Gordon Wetzstein

We present BetterScene, an approach to enhance novel view synthesis (NVS) quality for diverse real-world scenes using extremely sparse, unconstrained photos. BetterScene leverages the production-ready Stable Video Diffusion (SVD) model…

Computer Vision and Pattern Recognition · Computer Science 2026-02-27 Yuci Han, Charles Toth, John E. Anderson, William J. Shuart, Alper Yilmaz

We present Stable Video Diffusion - a latent video diffusion model for high-resolution, state-of-the-art text-to-video and image-to-video generation. Recently, latent diffusion models trained for 2D image synthesis have been turned into…

Computer Vision and Pattern Recognition · Computer Science 2023-11-28 Andreas Blattmann, Tim Dockhorn, Sumith Kulal, Daniel Mendelevitch, Maciej Kilian, Dominik Lorenz, Yam Levi, Zion English, Vikram Voleti, Adam Letts, Varun Jampani, Robin Rombach

Prior approaches injecting camera control into diffusion models have focused on specific subsets of 4D consistency tasks: novel view synthesis, text-to-video with camera control, image-to-video, amongst others. Therefore, these fragmented…

Computer Vision and Pattern Recognition · Computer Science 2026-01-26 Xiang Fan, Sharath Girish, Vivek Ramanujan, Chaoyang Wang, Ashkan Mirzaei, Petr Sushko, Aliaksandr Siarohin, Sergey Tulyakov, Ranjay Krishna