iEBAKER: Improved Remote Sensing Image-text Retrieval Framework Via Eliminate Before Align And Keyword Explicit Reasoning
2025 · Yan Zhang, Zhong Ji, Changxu Meng, et al.
Abstract
Recent studies focus on Remote Sensing Image-Text Retrieval (RSITR), which aims at searching for the corresponding targets based on the given query. Among these efforts, the application of Foundation Models (FMs), such as CLIP, to the domain of remote sensing has yielded encouraging outcomes. However, existing FM-based methodologies neglect the negative impact of weakly correlated sample pairs and fail to account for the key distinctions among remote sensing texts, leading to biased and superficial exploration of sample pairs. To address these challenges, we propose an approach named iEBAKER (an Improved Eliminate Before Align strategy with Keyword Explicit Reasoning framework) for RSITR. Specifically, we propose an innovative Eliminate Before Align (EBA) strategy to filter out the weakly correlated sample pairs, thereby mitigating their deviations from the optimal embedding space during alignment. Further, two specific schemes are introduced from the perspective of whether local simila
Related papers
- Transcending Fusion: A Multi-scale Alignment Method For Remote Sensing Image-text Retrieval (2024) 11.92
- Robust Remote Sensing Image-text Retrieval With Noisy Correspondence (2026) 1.24
- Fast-then-fine: A Two-stage Framework With Multi-granular Representation For Cross-modal Retrieval In Remote Sensing (2026) 0.00
- Towards A Multimodal Framework For Remote Sensing Image Change Retrieval And Captioning (2024) 8.85
- REJEPA: A Novel Joint-embedding Predictive Architecture For Efficient Remote Sensing Image Retrieval (2025) 2.26
- Self-enhancement Improves Text-image Retrieval In Foundation Visual-language Models (2023) 1.56
- Remote Sensing Cross-modal Text-image Retrieval Based On Global And Local Information (2022) 19.48
- DGTRSD & DGTRS-CLIP: A Dual-granularity Remote Sensing Image-text Dataset And Vision Language Foundation Model For Alignment (2025) 2.98