Related papers: Deep Visual Geo-localization Benchmark

ImLoc: Revisiting Visual Localization with Image-based Representation

Existing visual localization methods are typically either 2D image-based, which are easy to build and maintain but limited in effective geometric reasoning, or 3D structure-based, which achieve high accuracy but require a centralized…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-08 · Xudong Jiang, Fangjinhua Wang, Silvano Galliani, Christoph Vogel, Marc Pollefeys

Visual Localization Using Semantic Segmentation and Depth Prediction

In this paper, we propose a monocular visual localization pipeline leveraging semantic and depth cues. We apply semantic consistency evaluation to rank the image retrieval results and a practical clustering technique to reject estimation…

Computer Vision and Pattern Recognition · Computer Science · 2020-05-26 · Huanhuan Fan, Yuhao Zhou, Ang Li, Shuang Gao, Jijunnan Li, Yandong Guo

Rethinking Visual Geo-localization for Large-Scale Applications

Visual Geo-localization (VG) is the task of estimating the position where a given photo was taken by comparing it with a large database of images of known locations. To investigate how existing techniques would perform on a real-world…

Computer Vision and Pattern Recognition · Computer Science · 2022-04-08 · Gabriele Berton, Carlo Masone, Barbara Caputo

You Only Look & Listen Once: Towards Fast and Accurate Visual Grounding

Visual Grounding (VG) aims to locate the most relevant region in an image, based on a flexible natural language query but not a pre-defined label, thus it can be a more useful technique than object detection in practice. Most…

Computer Vision and Pattern Recognition · Computer Science · 2019-03-19 · Chaorui Deng, Qi Wu, Guanghui Xu, Zhuliang Yu, Yanwu Xu, Kui Jia, Mingkui Tan

A Simple and Better Baseline for Visual Grounding

Visual grounding aims to predict the locations of target objects specified by textual descriptions. For this task with linguistic and visual modalities, there is a latest research line that focuses on only selecting the linguistic-relevant…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-14 · Jingchao Wang, Wenlong Zhang, Dingjiang Huang, Hong Wang, Yefeng Zheng

Advancing Visual Grounding with Scene Knowledge: Benchmark and Method

Visual grounding (VG) aims to establish fine-grained alignment between vision and language. Ideally, it can be a testbed for vision-and-language models to evaluate their understanding of the images and texts and their reasoning abilities…

Computer Vision and Pattern Recognition · Computer Science · 2023-07-24 · Zhihong Chen, Ruifei Zhang, Yibing Song, Xiang Wan, Guanbin Li

CurriculumLoc: Enhancing Cross-Domain Geolocalization through Multi-Stage Refinement

Visual geolocalization is a cost-effective and scalable task that involves matching one or more query images, taken at some unknown location, to a set of geo-tagged reference images. Existing methods, devoted to semantic features…

Computer Vision and Pattern Recognition · Computer Science · 2025-01-14 · Boni Hu, Lin Chen, Runjian Chen, Shuhui Bu, Pengcheng Han, Haowei Li

Viewpoint Invariant Dense Matching for Visual Geolocalization

In this paper we propose a novel method for image matching based on dense local features and tailored for visual geolocalization. Dense local features matching is robust against changes in illumination and occlusions, but not against…

Computer Vision and Pattern Recognition · Computer Science · 2021-09-22 · Gabriele Berton, Carlo Masone, Valerio Paolicelli, Barbara Caputo

RGBT-Ground Benchmark: Visual Grounding Beyond RGB in Complex Real-World Scenarios

Visual Grounding (VG) aims to localize specific objects in an image according to natural language expressions, serving as a fundamental task in vision-language understanding. However, existing VG benchmarks are mostly derived from datasets…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-01 · Tianyi Zhao, Jiawen Xi, Linhui Xiao, Junnan Li, Xue Yang, Maoxun Yuan, Xingxing Wei

A Real-Time Fusion Framework for Long-term Visual Localization

Visual localization is a fundamental task that regresses the 6 Degree Of Freedom (6DoF) poses with image features in order to serve the high precision localization requests in many robotics applications. Degenerate conditions like motion…

Computer Vision and Pattern Recognition · Computer Science · 2022-10-19 · Yuchen Yang, Xudong Zhang, Shuang Gao, Jixiang Wan, Yishan Ping, Yuyue Liu, Jijunnan Li, Yandong Guo

Image-Based Benchmarking and Visualization for Large-Scale Global Optimization

In the context of optimization, visualization techniques can be useful for understanding the behaviour of optimization algorithms and can even provide a means to facilitate human interaction with an optimizer. Towards this goal, an…

Neural and Evolutionary Computing · Computer Science · 2020-07-27 · Kyle Robert Harrison, Azam Asilian Bidgoli, Shahryar Rahnamayan, Kalyanmoy Deb

Robust Image Retrieval-based Visual Localization using Kapture

Visual localization tackles the challenge of estimating the camera pose from images by using correspondence analysis between query images and a map. This task is computation and data intensive which poses challenges on thorough evaluation…

Computer Vision and Pattern Recognition · Computer Science · 2022-01-10 · Martin Humenberger, Yohann Cabon, Nicolas Guerin, Julien Morat, Vincent Leroy, Jérôme Revaud, Philippe Rerole, Noé Pion, Cesar de Souza, Gabriela Csurka

Z3D: Zero-Shot 3D Visual Grounding from Images

3D visual grounding (3DVG) aims to localize objects in a 3D scene based on natural language queries. In this work, we explore zero-shot 3DVG from multi-view images alone, without requiring any geometric supervision or object priors. We…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-04 · Nikita Drozdov, Andrey Lemeshko, Nikita Gavrilov, Anton Konushin, Danila Rukhovich, Maksim Kolodiazhnyi

Reloc-VGGT: Visual Re-localization with Geometry Grounded Transformer

Visual localization has traditionally been formulated as a pair-wise pose regression problem. Existing approaches mainly estimate relative poses between two images and employ a late-fusion strategy to obtain absolute pose estimates.…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-29 · Tianchen Deng, Wenhua Wu, Kunzhen Wu, Guangming Wang, Siting Zhu, Shenghai Yuan, Xun Chen, Guole Shen, Zhe Liu, Hesheng Wang

UNav: An Infrastructure-Independent Vision-Based Navigation System for People with Blindness and Low vision

Vision-based localization approaches now underpin newly emerging navigation pipelines for myriad use cases from robotics to assistive technologies. Compared to sensor-based solutions, vision-based localization does not require pre-installed…

Computer Vision and Pattern Recognition · Computer Science · 2022-11-21 · Anbang Yang, Mahya Beheshti, Todd E Hudson, Rajesh Vedanthan, Wachara Riewpaiboon, Pattanasak Mongkolwat, Chen Feng, John-Ross Rizzo

Learnable Query Aggregation with KV Routing for Cross-view Geo-localisation

Cross-view geo-localisation (CVGL) aims to estimate the geographic location of a query image by matching it with images from a large-scale database. However, the significant view-point discrepancies present considerable challenges for…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-01 · Hualin Ye, Bingxi Liu, Jixiang Du, Yu Qin, Ziyi Chen, Hong Zhang

Lightweight framework for underground pipeline recognition and spatial localization based on multi-view 2D GPR images

To address the issues of weak correlation between multi-view features, low recognition accuracy of small-scale targets, and insufficient robustness in complex scenarios in underground pipeline detection using 3D GPR, this paper proposes a…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-29 · Haotian Lv, Chao Li, Jiangbo Dai, Yuhui Zhang, Zepeng Fan, Yiqiu Tan, Dawei Wang, Binglei Xie

High-Fidelity Visual Structural Inspections through Transformers and Learnable Resizers

Visual inspection is the predominant technique for evaluating the condition of civil infrastructure. The recent advances in unmanned aerial vehicles (UAVs) and artificial intelligence have made the visual inspections faster, safer, and more…

Image and Video Processing · Electrical Eng. & Systems · 2022-10-25 · Kareem Eltouny, Seyedomid Sajedi, Xiao Liang

ConGeo: Robust Cross-view Geo-localization across Ground View Variations

Cross-view geo-localization aims at localizing a ground-level query image by matching it to its corresponding geo-referenced aerial view. In real-world scenarios, the task requires accommodating diverse ground images captured by users with…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-06 · Li Mi, Chang Xu, Javiera Castillo-Navarro, Syrielle Montariol, Wen Yang, Antoine Bosselut, Devis Tuia

ProGEO: Generating Prompts through Image-Text Contrastive Learning for Visual Geo-localization

Visual Geo-localization (VG) refers to the process to identify the location described in query images, which is widely applied in robotics field and computer vision tasks, such as autonomous driving, metaverse, augmented reality, and SLAM.…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-05 · Chen Mao, Jingqi Hu