Optimal Transport Aggregation For Visual Place Recognition
2023 · Sergio Izquierdo, Javier Civera
Abstract
The task of Visual Place Recognition (VPR) aims to match a query image against references from an extensive database of images from different places, relying solely on visual cues. State-of-the-art pipelines focus on the aggregation of features extracted from a deep backbone, in order to form a global descriptor for each image. In this context, we introduce SALAD (Sinkhorn Algorithm for Locally Aggregated Descriptors), which reformulates NetVLAD's soft-assignment of local features to clusters as an optimal transport problem. In SALAD, we consider both feature-to-cluster and cluster-to-feature relations and we also introduce a 'dustbin' cluster, designed to selectively discard features deemed non-informative, enhancing the overall descriptor quality. Additionally, we leverage and fine-tune DINOv2 as a backbone, which provides enhanced description power for the local features, and dramatically reduces the required training time. As a result, our single-stage method not only surpasses sin
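The assignment step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a precomputed feature-to-cluster score matrix, a single scalar dustbin score, and marginals in which each feature carries unit mass, each cluster receives unit mass, and the dustbin absorbs the remainder (so there must be more features than clusters). The function names `sinkhorn_with_dustbin` and `aggregate`, and all parameter values, are hypothetical.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)


def sinkhorn_with_dustbin(scores, dustbin_score=1.0, n_iters=50):
    """Soft-assign n features to m clusters via log-domain Sinkhorn.

    scores: (n, m) feature-to-cluster scores, with n > m.
    Returns an (n, m) assignment matrix; the dustbin column is dropped.
    """
    n, m = scores.shape
    # Append a dustbin column so features can be discarded.
    aug = np.concatenate([scores, np.full((n, 1), dustbin_score)], axis=1)
    # Assumed marginals: unit mass per feature; unit mass per cluster,
    # with the dustbin taking the remaining n - m.
    log_mu = np.zeros(n)
    log_nu = np.log(np.concatenate([np.ones(m), [n - m]]))
    u = np.zeros(n)
    v = np.zeros(m + 1)
    for _ in range(n_iters):
        # Alternate row and column normalization in log space.
        u = log_mu - _logsumexp(aug + v[None, :], axis=1)
        v = log_nu - _logsumexp(aug + u[:, None], axis=0)
    plan = np.exp(aug + u[:, None] + v[None, :])
    return plan[:, :m]


def aggregate(features, assignment):
    # Weighted sum of features per cluster, flattened and L2-normalized.
    desc = assignment.T @ features  # (m, d)
    desc = desc.reshape(-1)
    return desc / np.linalg.norm(desc)
```

Because rows and columns are normalized alternately, the plan reflects both feature-to-cluster and cluster-to-feature relations, and any mass routed to the dustbin column is simply left out of the aggregated descriptor.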
Related papers
- Focus On Local: Finding Reliable Discriminative Regions For Visual Place Recognition (2025) · 10.70
- VLAD-BUFF: Burst-aware Fast Feature Aggregation For Visual Place Recognition (2024) · 10.46
- MultiRes-NetVLAD: Augmenting Place Recognition Training With Low-resolution Imagery (2022) · 16.01
- Evaluation Of Visual Place Recognition Methods For Image Pair Retrieval In 3D Vision And Robotics (2026) · 0.00
- SAGE: Spatial-visual Adaptive Graph Exploration For Efficient Visual Place Recognition (2025) · 2.16
- StructVPR++: Distill Structural And Semantic Knowledge With Weighting Samples For Visual Place Recognition (2025) · 3.58
- Towards Test-time Efficient Visual Place Recognition Via Asymmetric Query Processing (2025) · 0.00
- Graph-based Non-linear Least Squares Optimization For Visual Place Recognition In Changing Environments (2020) · 7.16