Scale-semantic Joint Decoupling Network For Image-text Retrieval In Remote Sensing
2022 · Chengyu Zheng, Ning Song, Ruoyu Zhang, et al.
Abstract
Image-text retrieval in remote sensing aims to provide flexible information for data analysis and applications. In recent years, state-of-the-art methods have adopted "scale decoupling" and "semantic decoupling" strategies to further enhance representation capability. However, these approaches focus on disentangling either scale or semantics and do not merge the two ideas in a unified model, which severely limits the performance of cross-modal retrieval models. To address this issue, we propose a novel Scale-Semantic Joint Decoupling Network (SSJDN) for remote sensing image-text retrieval. Specifically, we design the Bidirectional Scale Decoupling (BSD) module, which uses Salience Feature Extraction (SFE) and Salience-Guided Suppression (SGS) units to adaptively extract potential features and suppress cumbersome features at other scales in a bidirectional pattern, yielding clues at different scales. In addition, we design the Label-supervised Semantic Decoupl
Related papers
- Exploring A Fine-grained Multiscale Method For Cross-modal Remote Sensing Image Retrieval (2022) · 16.73
- Transcending Fusion: A Multi-scale Alignment Method For Remote Sensing Image-text Retrieval (2024) · 11.92
- Remote Sensing Cross-modal Text-image Retrieval Based On Global And Local Information (2022) · 19.48
- Direction-oriented Visual-semantic Embedding Model For Remote Sensing Image-text Retrieval (2023) · 11.29
- Large Language Models For Captioning And Retrieving Remote Sensing Images (2024) · 0.00
- Deep Unsupervised Contrastive Hashing For Large-scale Cross-modal Text-image Retrieval In Remote Sensing (2022) · 0.00
- Deep Hashing Learning For Visual And Semantic Retrieval Of Remote Sensing Images (2019) · 13.55
- Fast-then-fine: A Two-stage Framework With Multi-granular Representation For Cross-modal Retrieval In Remote Sensing (2026) · 0.00