Exploring Masked Autoencoders For Sensor-agnostic Image Retrieval In Remote Sensing
2024 · Jakob Hackstein, Gencer Sumbul, Kai Norman Clasen, et al.
Abstract
Self-supervised learning through masked autoencoders (MAEs) has recently attracted great attention for remote sensing (RS) image representation learning, and thus holds significant potential for content-based image retrieval (CBIR) from ever-growing RS image archives. However, existing MAE-based CBIR studies in RS assume that the considered RS images are acquired by a single image sensor, and are thus only suitable for uni-modal CBIR problems. The effectiveness of MAEs for cross-sensor CBIR, which aims to search for semantically similar images across different image modalities, has not yet been explored. In this paper, we take the first step to explore the effectiveness of MAEs for sensor-agnostic CBIR in RS. To this end, we present a systematic overview of the possible adaptations of the vanilla MAE to exploit masked image modeling on multi-sensor RS image archives (denoted as cross-sensor masked autoencoders [CSMAEs]) in the context of CBIR. Based on different adjustments applie
Related papers
- CSMoE: An Efficient Remote Sensing Foundation Model With Soft Mixture-of-experts (2025) 0.00
- A Novel Self-supervised Cross-modal Image Retrieval Method In Remote Sensing (2022) 8.35
- REJEPA: A Novel Joint-embedding Predictive Architecture For Efficient Remote Sensing Image Retrieval (2025) 2.26
- Contrastive Audio-visual Masked Autoencoder (2022) 4.93
- Exploring A Fine-grained Multiscale Method For Cross-modal Remote Sensing Image Retrieval (2022) 16.73
- Class-specific Variational Auto-encoder For Content-based Image Retrieval (2023) 4.52
- CMIR-NET: A Deep Learning Based Model For Cross-modal Retrieval In Remote Sensing (2019) 13.34
- Challenging Decoder Helps In Masked Auto-encoder Pre-training For Dense Passage Retrieval (2023) 0.00