Deep Visual Geo-localization Benchmark

Computer Vision and Pattern Recognition 2023-06-13 v2

Abstract

In this paper, we propose a new open-source benchmarking framework for Visual Geo-localization (VG) that allows building, training, and testing a wide range of commonly used architectures, with the flexibility to change individual components of a geo-localization pipeline. The purpose of this framework is twofold: i) gaining insights into how different components and design choices in a VG pipeline impact the final results, both in terms of performance (recall@N metric) and system requirements (such as execution time and memory consumption); ii) establishing a systematic evaluation protocol for comparing different methods. Using the proposed framework, we perform a large suite of experiments which provide criteria for choosing the backbone, aggregation, and negative mining method depending on the use case and requirements. We also assess the impact of engineering techniques like pre/post-processing, data augmentation, and image resizing, showing that better performance can be obtained through somewhat simple procedures: for example, downscaling the images' resolution to 80% can lead to similar results with a 36% saving in extraction time and dataset storage requirements. Code and trained models are available at https://deep-vg-bench.herokuapp.com/.
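As a rough illustration of the recall@N metric mentioned in the abstract, the sketch below counts a query as correctly localized when at least one of its N nearest database descriptors is a known positive. This is not the framework's own code: the function name `recall_at_n`, the `positives` ground-truth format, and the brute-force distance computation are all assumptions made for the example.

```python
import numpy as np


def recall_at_n(query_desc, db_desc, positives, ns=(1, 5, 10)):
    """Fraction of queries with at least one correct match in the top-N results.

    query_desc: (Q, D) array of query descriptors.
    db_desc:    (M, D) array of database descriptors.
    positives:  list of Q sets with the database indices counted as correct
                for each query (hypothetical ground-truth format).
    """
    # Squared Euclidean distance between every query and database descriptor.
    diffs = query_desc[:, None, :] - db_desc[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    # Database indices sorted from nearest to farthest, one row per query.
    ranking = np.argsort(dists, axis=1)

    recalls = {}
    for n in ns:
        hits = sum(
            1
            for q, pos in enumerate(positives)
            if pos.intersection(ranking[q, :n].tolist())
        )
        recalls[n] = hits / len(positives)
    return recalls


# Toy example: 2 queries, 3 database images, 2-D descriptors.
queries = np.array([[0.0, 0.0], [10.0, 10.0]])
database = np.array([[0.0, 1.0], [5.0, 5.0], [10.0, 9.0]])
ground_truth = [{0}, {1}]
print(recall_at_n(queries, database, ground_truth, ns=(1, 2)))
```

In this toy setup the first query finds its positive at rank 1, while the second only finds it at rank 2, so recall grows as N increases. A real evaluation would typically replace the brute-force distance matrix with an approximate or batched nearest-neighbor search for large databases.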

Cite

@article{arxiv.2204.03444,
  title  = {Deep Visual Geo-localization Benchmark},
  author = {Gabriele Berton and Riccardo Mereu and Gabriele Trivigno and Carlo Masone and Gabriela Csurka and Torsten Sattler and Barbara Caputo},
  journal= {arXiv preprint arXiv:2204.03444},
  year   = {2023}
}

Comments

CVPR 2022 (Oral)
