Remote Sensing Cross-modal Text-image Retrieval Based On Global And Local Information
2022 · Zhiqiang Yuan, Wenkai Zhang, Changyuan Tian, et al.
Abstract
Cross-modal remote sensing text-image retrieval (RSCTIR) has recently become an urgent research hotspot due to its ability to enable fast and flexible information extraction from remote sensing (RS) images. However, current RSCTIR methods mainly focus on global features of RS images, which leads to the neglect of local features that reflect target relationships and saliency. In this article, we first propose a novel RSCTIR framework based on global and local information (GaLR), and design a multi-level information dynamic fusion (MIDF) module to effectively integrate features of different levels. MIDF leverages local information to correct global information, utilizes global information to supplement local information, and uses the dynamic addition of the two to generate a prominent visual representation. To alleviate the pressure of redundant targets on the graph convolution network (GCN) and to improve the model's attention to salient instances when modeling local features, the
Related papers
- Exploring A Fine-grained Multiscale Method For Cross-modal Remote Sensing Image Retrieval (2022) · 16.73
- A Novel Self-supervised Cross-modal Image Retrieval Method In Remote Sensing (2022) · 8.35
- Transcending Fusion: A Multi-scale Alignment Method For Remote Sensing Image-text Retrieval (2024) · 11.92
- Fast-then-fine: A Two-stage Framework With Multi-granular Representation For Cross-modal Retrieval In Remote Sensing (2026) · 0.00
- CMIR-NET: A Deep Learning Based Model For Cross-modal Retrieval In Remote Sensing (2019) · 13.34
- Towards A Multimodal Framework For Remote Sensing Image Change Retrieval And Captioning (2024) · 8.85
- Scale-semantic Joint Decoupling Network For Image-text Retrieval In Remote Sensing (2022) · 8.82
- Robust Remote Sensing Image-text Retrieval With Noisy Correspondence (2026) · 1.24