Learning Sequential Descriptors for Sequence-based Visual Place Recognition

Riccardo Mereu; Gabriele Trivigno; Gabriele Berton; Carlo Masone; Barbara Caputo

Computer Vision and Pattern Recognition 2022-07-11 v1

Abstract

In robotics, Visual Place Recognition is a continuous process that receives a video stream as input and produces a hypothesis of the robot's current position within a map of known places. This task requires robust, scalable, and efficient techniques for real applications. This work proposes a detailed taxonomy of techniques that use sequential descriptors, highlighting the different mechanisms used to fuse the information from the individual images. This categorization is supported by a complete benchmark of experimental results that provides evidence of the strengths and weaknesses of these architectural choices. Compared with existing sequential descriptor methods, we further investigate the viability of Transformers in place of CNN backbones, and we propose a new ad-hoc sequence-level aggregator called SeqVLAD, which outperforms the prior state of the art on different datasets. The code is available at https://github.com/vandal-vpr/vg-transformers.
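
To make the idea of fusing per-image information into a single sequential descriptor concrete, the sketch below shows two generic fusion mechanisms, concatenation and mean pooling, followed by nearest-neighbour retrieval. It is an illustrative assumption only: it is not SeqVLAD and not the authors' implementation, and all function names, the toy 2-D descriptors, and the place labels are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_by_concatenation(frame_descriptors):
    """Fuse a sequence by concatenating its per-frame descriptors in order."""
    fused = [x for desc in frame_descriptors for x in desc]
    return l2_normalize(fused)


def fuse_by_mean(frame_descriptors):
    """Fuse a sequence by averaging its per-frame descriptors element-wise."""
    n = len(frame_descriptors)
    dim = len(frame_descriptors[0])
    fused = [sum(desc[i] for desc in frame_descriptors) / n for i in range(dim)]
    return l2_normalize(fused)


def nearest_place(query, database):
    """Return the database key whose descriptor has the highest dot product
    with the query (cosine similarity, since descriptors are unit-norm)."""
    return max(
        database,
        key=lambda place: sum(q * d for q, d in zip(query, database[place])),
    )


# Toy per-frame descriptors (hypothetical values, 3 frames per sequence).
seq_a = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]]
seq_b = [[0.0, 1.0], [0.1, 0.9], [0.0, 1.0]]
query_seq = [[0.95, 0.05], [1.0, 0.0], [0.9, 0.2]]

# Build a tiny map of known places, then localize the query sequence.
database = {"place_a": fuse_by_mean(seq_a), "place_b": fuse_by_mean(seq_b)}
match = nearest_place(fuse_by_mean(query_seq), database)
print(match)
```

In this sketch, concatenation preserves frame order but its descriptor size grows with the sequence length, whereas mean pooling keeps the size fixed and discards order.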

Keywords

visual place recognition; action recognition; visual localization

Cite

@article{arxiv.2207.03868,
  title  = {Learning Sequential Descriptors for Sequence-based Visual Place Recognition},
  author = {Riccardo Mereu and Gabriele Trivigno and Gabriele Berton and Carlo Masone and Barbara Caputo},
  journal= {arXiv preprint arXiv:2207.03868},
  year   = {2022}
}

Comments

Accepted at IROS 2022
