Related papers: Continuous 3D Perception Model with Persistent Sta…

Mem3R: Streaming 3D Reconstruction with Hybrid Memory via Test-Time Training

Streaming 3D perception is well suited to robotics and augmented reality, where long visual streams must be processed efficiently and consistently. Recent recurrent models offer a promising solution by maintaining fixed-size states and…

Computer Vision and Pattern Recognition · Computer Science 2026-04-09 Changkun Liu, Jiezhi Yang, Zeman Li, Yuan Deng, Jiancong Guo, Luca Ballan

Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory

Dense 3D scene reconstruction from an ordered sequence or unordered image collections is a critical step when bringing research in computer vision into practical scenarios. Following the paradigm introduced by DUSt3R, which unifies an image…

Computer Vision and Pattern Recognition · Computer Science 2025-12-01 Yuqi Wu, Wenzhao Zheng, Jie Zhou, Jiwen Lu

STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

We present STream3R, a novel approach to 3D reconstruction that reformulates pointmap prediction as a decoder-only Transformer problem. Existing state-of-the-art methods for multi-view reconstruction either depend on expensive global…

Computer Vision and Pattern Recognition · Computer Science 2025-08-15 Yushi Lan, Yihang Luo, Fangzhou Hong, Shangchen Zhou, Honghua Chen, Zhaoyang Lyu, Shuai Yang, Bo Dai, Chen Change Loy, Xingang Pan

SurgCUT3R: Surgical Scene-Aware Continuous Understanding of Temporal 3D Representation

Reconstructing surgical scenes from monocular endoscopic video is critical for advancing robotic-assisted surgery. However, the application of state-of-the-art general-purpose reconstruction models is constrained by two key challenges: the…

Computer Vision and Pattern Recognition · Computer Science 2026-03-10 Kaiyuan Xu, Fangzhou Hong, Daniel Elson, Baoru Huang

MUT3R: Motion-aware Updating Transformer for Dynamic 3D Reconstruction

Recent stateful recurrent neural networks have achieved remarkable progress on static 3D reconstruction but remain vulnerable to motion-induced artifacts, where non-rigid regions corrupt attention propagation between the spatial memory and…

Computer Vision and Pattern Recognition · Computer Science 2025-12-04 Guole Shen, Tianchen Deng, Xingrui Qin, Nailin Wang, Jianyu Wang, Yanbo Wang, Yongtao Chen, Hesheng Wang, Jingchuan Wang

3D Scene Change Modeling With Consistent Multi-View Aggregation

Change detection plays a vital role in scene monitoring, exploration, and continual reconstruction. Existing 3D change detection methods often exhibit spatial inconsistency in the detected changes and fail to explicitly separate pre- and…

Computer Vision and Pattern Recognition · Computer Science 2025-12-30 Zirui Zhou, Junfeng Ni, Shujie Zhang, Yixin Chen, Siyuan Huang

FILT3R: Latent State Adaptive Kalman Filter for Streaming 3D Reconstruction

Streaming 3D reconstruction maintains a persistent latent state that is updated online from incoming frames, enabling constant-memory inference. A key failure mode is the state update rule: aggressive overwrites forget useful history, while…

Computer Vision and Pattern Recognition · Computer Science 2026-03-20 Seonghyun Jin, Jong Chul Ye

TTSA3R: Training-Free Temporal-Spatial Adaptive Persistent State for Streaming 3D Reconstruction

Streaming recurrent models enable efficient 3D reconstruction by maintaining persistent state representations. However, they suffer from catastrophic forgetting over long sequences due to balancing historical information with new…

Computer Vision and Pattern Recognition · Computer Science 2026-02-18 Zhijie Zheng, Xinhao Xiang, Jiawei Zhang

DePT3R: Joint Dense Point Tracking and 3D Reconstruction of Dynamic Scenes in a Single Forward Pass

Current methods for dense 3D point tracking in dynamic scenes typically rely on pairwise processing, require known camera poses, or assume temporal ordering of input frames, thereby constraining their flexibility and applicability.…

Computer Vision and Pattern Recognition · Computer Science 2026-04-06 Vivek Alumootil, Tuan-Anh Vu

G-CUT3R: Guided 3D Reconstruction with Camera and Depth Prior Integration

We introduce G-CUT3R, a novel feed-forward approach for guided 3D scene reconstruction that enhances the CUT3R model by integrating prior information. Unlike existing feed-forward methods that rely solely on input images, our method…

Computer Vision and Pattern Recognition · Computer Science 2025-09-30 Ramil Khafizov, Artem Komarichev, Ruslan Rakhimov, Peter Wonka, Evgeny Burnaev

LONG3R: Long Sequence Streaming 3D Reconstruction

Recent advancements in multi-view scene reconstruction have been significant, yet existing methods face limitations when processing streams of input images. These methods either rely on time-consuming offline optimization or are restricted…

Computer Vision and Pattern Recognition · Computer Science 2025-07-25 Zhuoguang Chen, Minghui Qin, Tianyuan Yuan, Zhe Liu, Hang Zhao

Ray-Aware Pointer Memory with Adaptive Updates for Streaming 3D Reconstruction

Dense 3D reconstruction from continuous image streams requires both accurate geometric aggregation and stable long-term memory management. Recent feed-forward reconstruction frameworks integrate observations through persistent memory…

Computer Vision and Pattern Recognition · Computer Science 2026-05-22 Feifei Li, Qi Song, Chi Zhang, Rui Huang

Dynamic Point Maps: A Versatile Representation for Dynamic 3D Reconstruction

DUSt3R has recently shown that one can reduce many tasks in multi-view geometry, including estimating camera intrinsics and extrinsics, reconstructing the scene in 3D, and establishing image correspondences, to the prediction of a pair of…

Computer Vision and Pattern Recognition · Computer Science 2025-03-21 Edgar Sucar, Zihang Lai, Eldar Insafutdinov, Andrea Vedaldi

Persistent Nature: A Generative Model of Unbounded 3D Worlds

Despite increasingly realistic image quality, recent 3D image generative models often operate on 3D volumes of fixed extent with limited camera motions. We investigate the task of unconditionally synthesizing unbounded nature scenes,…

Computer Vision and Pattern Recognition · Computer Science 2023-03-24 Lucy Chai, Richard Tucker, Zhengqi Li, Phillip Isola, Noah Snavely

Uni3R: Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-View Images

Reconstructing and semantically interpreting 3D scenes from sparse 2D views remains a fundamental challenge in computer vision. Conventional methods often decouple semantic understanding from reconstruction or necessitate costly per-scene…

Computer Vision and Pattern Recognition · Computer Science 2026-03-25 Xiangyu Sun, Haoyi Jiang, Liu Liu, Seungtae Nam, Gyeongjin Kang, Xinjie Wang, Wei Sui, Zhizhong Su, Wenyu Liu, Xinggang Wang, Eunbyung Park

Robo3R: Enhancing Robotic Manipulation with Accurate Feed-Forward 3D Reconstruction

3D spatial perception is fundamental to generalizable robotic manipulation, yet obtaining reliable, high-quality 3D geometry remains challenging. Depth sensors suffer from noise and material sensitivity, while existing reconstruction models…

Robotics · Computer Science 2026-05-05 Sizhe Yang, Linning Xu, Hao Li, Juncheng Mu, Jia Zeng, Dahua Lin, Jiangmiao Pang

GRS-SLAM3R: Real-Time Dense SLAM with Gated Recurrent State

DUSt3R-based end-to-end scene reconstruction has recently shown promising results in dense visual SLAM. However, most existing methods only use image pairs to estimate pointmaps, overlooking spatial memory and global consistency. To this…

Computer Vision and Pattern Recognition · Computer Science 2025-09-30 Guole Shen, Tianchen Deng, Yanbo Wang, Yongtao Chen, Yilin Shen, Jiuming Liu, Jingchuan Wang

PointRecon: Online Point-based 3D Reconstruction via Ray-based 2D-3D Matching

We propose a novel online, point-based 3D reconstruction method from posed monocular RGB videos. Our model maintains a global point cloud representation of the scene, continuously updating the features and 3D locations of points as new…

Computer Vision and Pattern Recognition · Computer Science 2024-11-25 Chen Ziwen, Zexiang Xu, Li Fuxin

PAS3R: Pose-Adaptive Streaming 3D Reconstruction for Long Video Sequences

Online monocular 3D reconstruction enables dense scene recovery from streaming video but remains fundamentally limited by the stability-adaptation dilemma: the reconstruction model must rapidly incorporate novel viewpoints while preserving…

Computer Vision and Pattern Recognition · Computer Science 2026-03-24 Lanbo Xu, Liang Guo, Caigui Jiang, Cheng Wang

Edit3r: Instant 3D Scene Editing from Sparse Unposed Images

We present Edit3r, a feed-forward framework that reconstructs and edits 3D scenes in a single pass from unposed, view-inconsistent, instruction-edited images. Unlike prior methods requiring per-scene optimization, Edit3r directly predicts…

Computer Vision and Pattern Recognition · Computer Science 2026-01-01 Jiageng Liu, Weijie Lyu, Xueting Li, Yejie Guo, Ming-Hsuan Yang