VLM-guided Visual Place Recognition For Planet-scale Geo-localization
2025 · Sania Waheed, Na Min An, Michael Milford, et al.
Abstract
Geo-localization from a single image at planet scale (essentially an advanced or extreme version of the kidnapped robot problem) is a fundamental and challenging task in applications such as navigation, autonomous driving and disaster response due to the vast diversity of locations, environmental conditions, and scene variations. Traditional retrieval-based methods for geo-localization struggle with scalability and perceptual aliasing, while classification-based approaches lack generalization and require extensive training data. Recent advances in vision-language models (VLMs) offer a promising alternative by leveraging contextual understanding and reasoning. However, while VLMs achieve high accuracy, they are often prone to hallucinations and lack interpretability, making them unreliable as standalone solutions. In this work, we propose a novel hybrid geo-localization framework that combines the strengths of VLMs with retrieval-based visual place recognition (VPR) methods. Our approac
Related papers
- LaVPR: Benchmarking Language And Vision For Place Recognition (2026) · 2.35
- Focus On Local: Finding Reliable Discriminative Regions For Visual Place Recognition (2025) · 10.70
- MultiRes-NetVLAD: Augmenting Place Recognition Training With Low-resolution Imagery (2022) · 16.01
- Evaluation Of Visual Place Recognition Methods For Image Pair Retrieval In 3D Vision And Robotics (2026) · 0.00
- VGGT-MPR: VGGT-enhanced Multimodal Place Recognition In Autonomous Driving Environments (2026) · 0.00
- Collaborative Visual Place Recognition Through Federated Learning (2024) · 2.26
- EmbodiedPlace: Learning Mixture-of-features With Embodied Constraints For Visual Place Recognition (2025) · 0.00
- Data-efficient Large Scale Place Recognition With Graded Similarity Supervision (2023) · 16.32