Related papers: Multi-View Foundation Models

200 papers

Human decision-making often relies on visual information from multiple perspectives or views. In contrast, machine learning-based object recognition utilizes information from a single image of the object. However, the information conveyed…

Computer Vision and Pattern Recognition · Computer Science 2025-10-01 Mona Alzahrani, Muhammad Usman, Salma Kammoun, Saeed Anwar, Tarek Helmy

The 3D point cloud representation plays a crucial role in preserving the geometric fidelity of the physical world, enabling more accurate complex 3D environments. While humans naturally comprehend the intricate relationships between objects…

Computer Vision and Pattern Recognition · Computer Science 2025-01-31 Vishal Thengane, Xiatian Zhu, Salim Bouzerdoum, Son Lam Phung, Yunpeng Li

The lifting of 3D structure and camera from 2D landmarks is a cornerstone of the entire discipline of computer vision. Traditional methods have been confined to specific rigid objects, such as those in Perspective-n-Point (PnP)…

Computer Vision and Pattern Recognition · Computer Science 2026-03-17 Mosam Dabhi, Laszlo A. Jeni, Simon Lucey

Benchmarking 3D spatial understanding of foundation models is essential for real-world applications such as robotics and autonomous driving. Existing evaluations often rely on downstream fine-tuning with linear heads or task-specific…

Computer Vision and Pattern Recognition · Computer Science 2026-01-19 Valentina Lilova, Toyesh Chakravorty, Julian I. Bibo, Emma Boccaletti, Brandon Li, Lívia Baxová, Cees G. M. Snoek, Mohammadreza Salehi

Understanding the mechanisms underlying deep neural networks remains a fundamental challenge in machine learning and computer vision. One promising, yet only preliminarily explored approach, is feature inversion, which attempts to…

Computer Vision and Pattern Recognition · Computer Science 2025-08-15 Jan Rathjens, Shirin Reyhanian, David Kappel, Laurenz Wiskott

Vision foundation models (VFMs) trained on large-scale image datasets provide high-quality features that have significantly advanced 2D visual recognition. However, their potential in 3D scene segmentation remains largely untapped, despite…

Computer Vision and Pattern Recognition · Computer Science 2026-01-23 Karim Knaebel, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe

Vision Foundation Models (VFMs) have become the cornerstone of modern computer vision, offering robust representations across a wide array of tasks. While recent advances allow these models to handle varying input sizes during training,…

Computer Vision and Pattern Recognition · Computer Science 2026-04-06 Bocheng Zou, Mu Cai, Mark Stanley, Dingfu Lu, Yong Jae Lee

Although large-scale visual foundation models (VFMs) achieve remarkable performance in semantic understanding, they still underperform in instance-aware dense prediction tasks. They exhibit different biases in representation: for instance,…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Yachan Guo, JoseLuis Gomez Zurita, Danna Xue, Yi Xiao, AntonioManuel Lopez Pena

Multi-camera surveillance has been an active research topic for understanding and modeling scenes. Compared to a single camera, multiple cameras provide a larger field of view and more object cues, and the related applications are multi-view…

Computer Vision and Pattern Recognition · Computer Science 2022-05-03 Qi Zhang, Antoni B. Chan

Recent advances in large-scale pretraining have yielded visual foundation models with strong capabilities. Not only can recent models generalize to arbitrary images for their training task, their intermediate representations are useful for…

Computer Vision and Pattern Recognition · Computer Science 2024-04-15 Mohamed El Banani, Amit Raj, Kevis-Kokitsi Maninis, Abhishek Kar, Yuanzhen Li, Michael Rubinstein, Deqing Sun, Leonidas Guibas, Justin Johnson, Varun Jampani

Foundation models have garnered increasing attention for representation learning in remote sensing. Many such foundation models adopt approaches that have demonstrated success in computer vision with minimal domain-specific modification.…

Computer Vision and Pattern Recognition · Computer Science 2026-01-28 Kevin Lane, Morteza Karimzadeh

In this paper, we analyze the viewpoint stability of foundational models (specifically, their sensitivity to changes in viewpoint) and define instability as significant feature variations resulting from minor changes in viewing angle,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-31 Mateusz Michalkiewicz, Sheena Bai, Mahsa Baktashmotlagh, Varun Jampani, Guha Balakrishnan

In recent years, 3D vision has become a crucial field within computer vision, powering a wide range of applications such as autonomous driving, robotics, augmented reality, and medical imaging. This field relies on accurate perception,…

Computer Vision and Pattern Recognition · Computer Science 2025-04-02 Zhen Wang, Dongyuan Li, Yaozu Wu, Tianyu He, Jiang Bian, Renhe Jiang

Generating consistent multiple views for 3D reconstruction tasks is still a challenge for existing image-to-3D diffusion models. Generally, incorporating 3D representations into diffusion models decreases the model's speed as well as…

Computer Vision and Pattern Recognition · Computer Science 2024-06-14 Emmanuelle Bourigault, Pauline Bourigault

3D visual perception tasks based on multi-camera images are essential for autonomous driving systems. Latest work in this field performs 3D object detection by leveraging multi-view images as an input and iteratively enhancing object…

Computer Vision and Pattern Recognition · Computer Science 2023-07-31 Jongwoo Park, Apoorv Singh, Varun Bankiti

The development of medical vision-language foundation models has attracted significant attention in the field of medicine and healthcare due to their promising prospect in various clinical applications. While previous studies have commonly…

Computer Vision and Pattern Recognition · Computer Science 2024-02-27 Weijian Huang, Cheng Li, Hong-Yu Zhou, Jiarun Liu, Hao Yang, Yong Liang, Guangming Shi, Hairong Zheng, Shanshan Wang

Segment matching is an important intermediate task in computer vision that establishes correspondences between semantically or geometrically coherent regions across images. Unlike keypoint matching, which focuses on localized features,…

Computer Vision and Pattern Recognition · Computer Science 2025-10-27 Rohit Jayanti, Swayam Agrawal, Vansh Garg, Siddharth Tourani, Muhammad Haris Khan, Sourav Garg, Madhava Krishna

Humans understand the world through the integration of multiple sensory modalities, enabling them to perceive, reason about, and imagine dynamic physical processes. Inspired by this capability, multimodal foundation models (MFMs) have…

Artificial Intelligence · Computer Science 2025-10-07 Xuehai He

Vision Foundation Models (VFMs) have become a de facto choice for many downstream vision tasks, like image classification, image segmentation, and object localization. However, they can also provide significant utility for downstream 3D…

Computer Vision and Pattern Recognition · Computer Science 2025-04-22 Johannes Spoecklberger, Wei Lin, Pedro Hermosilla, Sivan Doveh, Horst Possegger, M. Jehanzeb Mirza

Foundation models (FMs) are changing the way medical images are analyzed by learning from large collections of unlabeled data. Instead of relying on manually annotated examples, FMs are pre-trained to learn general-purpose visual features…
