\(R^{2}\)Former: Unified Retrieval and Reranking Transformer for Place Recognition
2023 · Sijie Zhu, Linjie Yang, Chen Chen, et al.
Abstract
Visual Place Recognition (VPR) estimates the location of query images by matching them with images in a reference database. Conventional methods generally adopt aggregated CNN features for global retrieval and RANSAC-based geometric verification for reranking. However, RANSAC uses only geometric information and ignores other information that could be useful for reranking, e.g., local feature correlations and attention values. In this paper, we propose a unified place recognition framework that handles both retrieval and reranking with a novel transformer model, named \(R^{2}\)Former. The proposed reranking module takes feature correlation, attention value, and xy coordinates into account, and learns to determine whether the image pair is from the same location. The whole pipeline is end-to-end trainable, and the reranking module alone can also be adopted on other CNN or transformer backbones as a generic component. Remarkably, \(R^{2}\)Former significantly outperforms st
Related papers
- PlaceFormer: Transformer-Based Visual Place Recognition Using Multi-Scale Patch Selection and Fusion (2024) 7.81
- UniPR-3D: Towards Universal Visual Place Recognition with Visual Geometry Grounded Transformer (2025) 2.95
- Towards Implicit Aggregation: Robust Image Representation for Place Recognition in the Transformer Era (2025) 3.09
- Instance-Level Image Retrieval Using Reranking Transformers (2021) 19.00
- Regressing Transformers for Data-Efficient Visual Place Recognition (2024) 3.58
- Evaluation of Visual Place Recognition Methods for Image Pair Retrieval in 3D Vision and Robotics (2026) 0.00
- MultiRes-NetVLAD: Augmenting Place Recognition Training with Low-Resolution Imagery (2022) 16.01
- Spatio-Semantic ConvNet-Based Visual Place Recognition (2019) 10.21