Related papers: Speed3R: Sparse Feed-forward 3D Reconstruction Mod…

200 papers

Feed-forward 3D reconstruction models based on Vision Transformers can directly estimate scene geometry and camera poses from a small set of input images, but scaling them to video inputs with hundreds or thousands of frames remains…

Computer Vision and Pattern Recognition · Computer Science 2026-05-20 Zecheng Tang, Jiaye Fu, Qiankun Gao, Haijie Li, Yanmin Wu, Jiaqi Zhang, Siwei Ma, Jian Zhang

Current multi-view 3D reconstruction methods rely on accurate camera calibration and pose estimation, requiring complex and time-intensive pre-processing that hinders their practical deployment. To address this challenge, we introduce…

Graphics · Computer Science 2025-08-07 Haodong Zhu, Changbai Li, Yangyang Ren, Zichao Feng, Xuhui Liu, Hanlin Chen, Xiantong Zhen, Baochang Zhang

We present AMB3R, a multi-view feed-forward model for dense 3D reconstruction at metric scale that addresses diverse 3D vision tasks. The key idea is to leverage a sparse, yet compact, volumetric scene representation as our backend,…

Computer Vision and Pattern Recognition · Computer Science 2025-11-26 Hengyi Wang, Lourdes Agapito

We present Light3R-SfM, a feed-forward, end-to-end learnable framework for efficient large-scale Structure-from-Motion (SfM) from unconstrained image collections. Unlike existing SfM solutions that rely on costly matching and global…

Computer Vision and Pattern Recognition · Computer Science 2025-01-28 Sven Elflein, Qunjie Zhou, Sérgio Agostinho, Laura Leal-Taixé

Image matching is a key component of modern 3D vision algorithms, essential for accurate scene reconstruction and localization. MASt3R redefines image matching as a 3D task by leveraging DUSt3R and introducing a fast reciprocal matching…

Computer Vision and Pattern Recognition · Computer Science 2025-03-14 Jingxing Li, Yongjae Lee, Abhay Kumar Yadav, Cheng Peng, Rama Chellappa, Deliang Fan

We present Spann3R, a novel approach for dense 3D reconstruction from ordered or unordered image collections. Built on the DUSt3R paradigm, Spann3R uses a transformer-based architecture to directly regress pointmaps from images without any…

Computer Vision and Pattern Recognition · Computer Science 2024-08-30 Hengyi Wang, Lourdes Agapito

We present Edit3r, a feed-forward framework that reconstructs and edits 3D scenes in a single pass from unposed, view-inconsistent, instruction-edited images. Unlike prior methods requiring per-scene optimization, Edit3r directly predicts…

Computer Vision and Pattern Recognition · Computer Science 2026-01-01 Jiageng Liu, Weijie Lyu, Xueting Li, Yejie Guo, Ming-Hsuan Yang

Transformer-based 3D reconstruction has emerged as a powerful paradigm for recovering geometry and appearance from multi-view observations, offering strong performance across challenging visual conditions. As these models scale to larger…

Computer Vision and Pattern Recognition · Computer Science 2026-05-13 Haoyu Zhang, Zeyu Zhang, Zedong Zhou, Yang Zhao, Hao Tang

Realtime 4D reconstruction for dynamic scenes remains a crucial challenge for autonomous driving perception. Most existing methods rely on depth estimation through self-supervision or multi-modality sensor fusion. In this paper, we propose…

Computer Vision and Pattern Recognition · Computer Science 2024-12-10 Xin Fei, Wenzhao Zheng, Yueqi Duan, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Jiwen Lu

Dense 3D scene reconstruction from an ordered sequence or unordered image collections is a critical step when bringing research in computer vision into practical scenarios. Following the paradigm introduced by DUSt3R, which unifies an image…

Computer Vision and Pattern Recognition · Computer Science 2025-12-01 Yuqi Wu, Wenzhao Zheng, Jie Zhou, Jiwen Lu

3D super-resolution (3DSR) aims to reconstruct high-resolution (HR) 3D scenes from low-resolution (LR) multi-view images. Existing methods rely on dense LR inputs and per-scene optimization, which restricts the high-frequency priors for…

Computer Vision and Pattern Recognition · Computer Science 2026-03-02 Xiang Feng, Xiangbo Wang, Tieshi Zhong, Chengkai Wang, Yiting Zhao, Tianxiang Xu, Zhenzhong Kuang, Feiwei Qin, Xuefei Yin, Yanming Zhu

To advance the state of the art in the creation of 3D foundation models, this paper introduces the ConDense framework for 3D pre-training utilizing existing pre-trained 2D networks and large-scale multi-view datasets. We propose a novel…

Computer Vision and Pattern Recognition · Computer Science 2024-09-02 Xiaoshuai Zhang, Zhicheng Wang, Howard Zhou, Soham Ghosh, Danushen Gnanapragasam, Varun Jampani, Hao Su, Leonidas Guibas

Dense matching methods like DUSt3R regress pairwise pointmaps for 3D reconstruction. However, the reliance on pairwise prediction and the limited generalization capability inherently restrict the global geometric consistency. In this work,…

Computer Vision and Pattern Recognition · Computer Science 2025-06-17 Yuheng Yuan, Qiuhong Shen, Shizun Wang, Xingyi Yang, Xinchao Wang

Current methods for dense 3D point tracking in dynamic scenes typically rely on pairwise processing, require known camera poses, or assume temporal ordering of input frames, thereby constraining their flexibility and applicability.…

Computer Vision and Pattern Recognition · Computer Science 2026-04-06 Vivek Alumootil, Tuan-Anh Vu

We present Dense-SfM, a novel Structure from Motion (SfM) framework designed for dense and accurate 3D reconstruction from multi-view images. Sparse keypoint matching, which traditional SfM methods often rely on, limits both accuracy and…

Computer Vision and Pattern Recognition · Computer Science 2026-01-29 JongMin Lee, Sungjoo Yoo

We present Fin3R, a simple, effective, and general fine-tuning method for feed-forward 3D reconstruction models. This family of feed-forward reconstruction models regresses the pointmaps of all input images into a reference-frame coordinate system,…

Computer Vision and Pattern Recognition · Computer Science 2025-12-01 Weining Ren, Hongjun Wang, Xiao Tan, Kai Han

We present PreF3R, Pose-Free Feed-forward 3D Reconstruction from an image sequence of variable length. Unlike previous approaches, PreF3R removes the need for camera calibration and reconstructs the 3D Gaussian field within a canonical…

Computer Vision and Pattern Recognition · Computer Science 2024-11-27 Zequn Chen, Jiezhi Yang, Heng Yang

Recent feed-forward geometry foundation models have demonstrated impressive generalization by recovering depth and poses in a single forward pass. However, these models are typically constrained by a global coordinate frame assumption. This…

Computer Vision and Pattern Recognition · Computer Science 2026-05-27 Congrong Xu, Huachen Gao, Xingyu Chen, Yuliang Xiu, Jun Gao, Anpei Chen

We study the problem of single-image 3D object reconstruction. Recent works have diverged into two directions: regression-based modeling and generative modeling. Regression methods efficiently infer visible surfaces, but struggle with…

Computer Vision and Pattern Recognition · Computer Science 2025-01-09 Zixuan Huang, Mark Boss, Aaryaman Vasishta, James M. Rehg, Varun Jampani

Multi-view 3D reconstruction remains a core challenge in computer vision, particularly in applications requiring accurate and scalable representations across diverse perspectives. Current leading methods such as DUSt3R employ a…

Computer Vision and Pattern Recognition · Computer Science 2025-03-21 Jianing Yang, Alexander Sax, Kevin J. Liang, Mikael Henaff, Hao Tang, Ang Cao, Joyce Chai, Franziska Meier, Matt Feiszli