Related papers: Can Video Diffusion Model Reconstruct 4D Geometry?

200 papers

Monocular dynamic reconstruction is a challenging and long-standing vision problem due to the highly ill-posed nature of the task. Existing approaches depend on templates, are effective only in quasi-static scenes, or fail to model 3D…

Computer Vision and Pattern Recognition · Computer Science 2025-10-17 Qianqian Wang, Vickie Ye, Hang Gao, Weijia Zeng, Jake Austin, Zhengqi Li, Angjoo Kanazawa

Novel view synthesis from monocular videos of dynamic scenes with unknown camera poses remains a fundamental challenge in computer vision and graphics. While recent advances in 3D representations such as Neural Radiance Fields (NeRF) and 3D…

Computer Vision and Pattern Recognition · Computer Science 2025-11-10 Mengqi Guo, Bo Xu, Yanyan Li, Gim Hee Lee

We introduce Geo4D, a method to repurpose video diffusion models for monocular 3D reconstruction of dynamic scenes. By leveraging the strong dynamic priors captured by large-scale pre-trained video models, Geo4D can be trained using only…

Computer Vision and Pattern Recognition · Computer Science 2025-08-20 Zeren Jiang, Chuanxia Zheng, Iro Laina, Diane Larlus, Andrea Vedaldi

In this paper, we propose VideoFrom3D, a novel framework for synthesizing high-quality 3D scene videos from coarse geometry, a camera trajectory, and a reference image. Our approach streamlines the 3D graphic design workflow, enabling…

Graphics · Computer Science 2025-09-23 Geonung Kim, Janghyeok Han, Sunghyun Cho

Advancements in 3D scene reconstruction have transformed 2D images from the real world into 3D models, producing realistic 3D results from hundreds of input photos. Despite great success in dense-view reconstruction scenarios, rendering a…

Computer Vision and Pattern Recognition · Computer Science 2025-06-26 Fangfu Liu, Wenqiang Sun, Hanyang Wang, Yikai Wang, Haowen Sun, Junliang Ye, Jun Zhang, Yueqi Duan

Video stabilization aims to mitigate camera shake but faces a fundamental trade-off between geometric robustness and full-frame consistency. While 2D methods suffer from aggressive cropping, 3D techniques are often undermined by fragile…

Computer Vision and Pattern Recognition · Computer Science 2026-03-09 Muhua Zhu, Xinhao Jin, Yu Zhang, Yifei Xue, Tie Ji, Yizhen Lao

Real-world applications like video gaming and virtual reality often demand the ability to model 3D scenes that users can explore along custom camera trajectories. While significant progress has been made in generating 3D objects from text…

Computer Vision and Pattern Recognition · Computer Science 2025-06-05 Tianyu Huang, Wangguandong Zheng, Tengfei Wang, Yuhao Liu, Zhenwei Wang, Junta Wu, Jie Jiang, Hui Li, Rynson W. H. Lau, Wangmeng Zuo, Chunchao Guo

Realtime 4D reconstruction for dynamic scenes remains a crucial challenge for autonomous driving perception. Most existing methods rely on depth estimation through self-supervision or multi-modality sensor fusion. In this paper, we propose…

Computer Vision and Pattern Recognition · Computer Science 2024-12-10 Xin Fei, Wenzhao Zheng, Yueqi Duan, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Jiwen Lu

Online monocular 3D reconstruction enables dense scene recovery from streaming video but remains fundamentally limited by the stability-adaptation dilemma: the reconstruction model must rapidly incorporate novel viewpoints while preserving…

Computer Vision and Pattern Recognition · Computer Science 2026-03-24 Lanbo Xu, Liang Guo, Caigui Jiang, Cheng Wang

Generating interactive and dynamic 4D scenes from a single static image remains a core challenge. Most existing generate-then-reconstruct and reconstruct-then-generate methods decouple geometry from motion, causing spatiotemporal…

Computer Vision and Pattern Recognition · Computer Science 2025-12-05 Yanran Zhang, Ziyi Wang, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu

The spatio-temporal complexity of video data presents significant challenges in tasks such as compression, generation, and inpainting. We present four key contributions to address the challenges of spatiotemporal video processing. First, we…

Computer Vision and Pattern Recognition · Computer Science 2025-03-12 Onkar Susladkar, Jishu Sen Gupta, Chirag Sehgal, Sparsh Mittal, Rekha Singhal

The recently developed Sora model [1] has exhibited remarkable capabilities in video generation, sparking intense discussions regarding its ability to simulate real-world phenomena. Despite its growing popularity, there is a lack of…

Computer Vision and Pattern Recognition · Computer Science 2024-02-28 Xuanyi Li, Daquan Zhou, Chenxu Zhang, Shaodong Wei, Qibin Hou, Ming-Ming Cheng

Reconstructing dynamic 4D scenes remains challenging due to the presence of moving objects that corrupt camera pose estimation. Existing optimization methods alleviate this issue with additional supervision, but they are mostly…

Computer Vision and Pattern Recognition · Computer Science 2026-03-09 Juntong Fang, Zequn Chen, Weiqi Zhang, Donglin Di, Xuancheng Zhang, Chengmin Yang, Yu-Shen Liu

We present NOVA3R, an effective approach for non-pixel-aligned 3D reconstruction from a set of unposed images in a feed-forward manner. Unlike pixel-aligned methods that tie geometry to per-ray predictions, our formulation learns a global,…

Computer Vision and Pattern Recognition · Computer Science 2026-03-06 Weirong Chen, Chuanxia Zheng, Ganlin Zhang, Andrea Vedaldi, Daniel Cremers

Estimating geometry from dynamic scenes, where objects move and deform over time, remains a core challenge in computer vision. Current approaches often rely on multi-stage pipelines or global optimizations that decompose the problem into…

Computer Vision and Pattern Recognition · Computer Science 2025-05-09 Junyi Zhang, Charles Herrmann, Junhwa Hur, Varun Jampani, Trevor Darrell, Forrester Cole, Deqing Sun, Ming-Hsuan Yang

We present PAD3R, a method for reconstructing deformable 3D objects from casually captured, unposed monocular videos. Unlike existing approaches, PAD3R handles long video sequences featuring substantial object deformation, large-scale…

Computer Vision and Pattern Recognition · Computer Science 2025-09-30 Ting-Hsuan Liao, Haowen Liu, Yiran Xu, Songwei Ge, Gengshan Yang, Jia-Bin Huang

Dynamic Novel View Synthesis aims to generate photorealistic views of moving subjects from arbitrary viewpoints. This task is particularly challenging when relying on monocular video, where disentangling structure from motion is ill-posed…

Computer Vision and Pattern Recognition · Computer Science 2025-06-24 Michal Nazarczuk, Sibi Catley-Chandar, Thomas Tanay, Zhensong Zhang, Gregory Slabaugh, Eduardo Pérez-Pellitero

Reconstructing and tracking dynamic 3D scenes remains a fundamental challenge in computer vision. Existing approaches often decouple geometry from motion: multi-view reconstruction methods assume static scenes, while dynamic tracking…

Computer Vision and Pattern Recognition · Computer Science 2026-02-17 Shenhan Qian, Ganlin Zhang, Shangzhe Wu, Daniel Cremers

Recent advances in diffusion-based video generation have opened new possibilities for controllable video editing, yet realistic video object insertion (VOI) remains challenging due to limited 4D scene understanding and inadequate handling…

Computer Vision and Pattern Recognition · Computer Science 2025-12-22 Hoiyeong Jin, Hyojin Jang, Jeongho Kim, Junha Hyung, Kinam Kim, Dongjin Kim, Huijin Choi, Hyeonji Kim, Jaegul Choo

Camera redirection aims to replay a dynamic scene from a single monocular video under a user-specified camera trajectory. However, large-angle redirection is inherently ill-posed: a monocular video captures only a narrow spatio-temporal…

Computer Vision and Pattern Recognition · Computer Science 2026-05-20 Wei Cao, Hao Zhang, Fengrui Tian, Yulun Wu, Yingying Li, Shenlong Wang, Ning Yu, Yaoyao Liu