Learning To Evaluate Performance Of Multi-modal Semantic Localization
2022 · Zhiqiang Yuan, Wenkai Zhang, Chongyang Li, et al.
Abstract
Semantic localization (SeLo) refers to the task of obtaining the most relevant locations in large-scale remote sensing (RS) images using semantic information such as text. As an emerging task based on cross-modal retrieval, SeLo achieves semantic-level retrieval with only caption-level annotation, which demonstrates its great potential in unifying downstream tasks. Although SeLo has been addressed in a succession of works, none has systematically explored and analyzed this urgent direction. In this paper, we thoroughly study this field and provide a complete benchmark in terms of metrics and test data to advance the SeLo task. First, based on the characteristics of this task, we propose multiple discriminative evaluation metrics to quantify the performance of the SeLo task. The devised significant area proportion, attention shift distance, and discrete attention distance are used to evaluate the generated SeLo map at the pixel level and the region level. Next, to provide
Related papers
- Semantic Signatures For Large-scale Visual Localization (2020) 5.24
- Retrieval And Localization With Observation Constraints (2021) 5.24
- SLAN: Self-locator Aided Network For Cross-modal Understanding (2022) 0.00
- Semantic Pose Verification For Outdoor Visual Localization With Self-supervised Contrastive Learning (2022) 8.35
- Uniloc: Towards Universal Place Recognition Using Any Single Modality (2024) 0.00
- Do Cross Modal Systems Leverage Semantic Relationships? (2019) 7.16
- DASGIL: Domain Adaptation For Semantic And Geometric-aware Image-based Localization (2020) 13.11
- Content-based Landmark Retrieval Combining Global And Local Features Using Siamese Neural Networks (2022) 0.00