REJEPA: A Novel Joint-embedding Predictive Architecture For Efficient Remote Sensing Image Retrieval
2025 · Shabnam Choudhury, Yash Salunkhe, Sarthak Mehrotra, et al.
Abstract
The rapid expansion of remote sensing image archives demands the development of strong and efficient techniques for content-based image retrieval (RS-CBIR). This paper presents REJEPA (Retrieval with Joint-Embedding Predictive Architecture), an innovative self-supervised framework designed for unimodal RS-CBIR. REJEPA utilises spatially distributed context token encoding to forecast abstract representations of target tokens, effectively capturing high-level semantic features and eliminating unnecessary pixel-level details. In contrast to generative methods that focus on pixel reconstruction or contrastive techniques that depend on negative pairs, REJEPA functions within feature space, achieving a reduction in computational complexity of 40-60% when compared to pixel-reconstruction baselines like Masked Autoencoders (MAE). To guarantee strong and varied representations, REJEPA incorporates Variance-Invariance-Covariance Regularisation (VICReg), which prevents encoder collapse by promoti
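The abstract names VICReg as the regulariser that keeps the encoder from collapsing. As a rough illustration only, the sketch below computes the three standard VICReg terms (invariance, variance, covariance) in NumPy on a batch of predicted and target embeddings. The function name `vicreg_loss`, the weights 25/25/1 (the defaults from the original VICReg formulation), and the use of plain arrays in place of REJEPA's predictor and target-encoder outputs are all assumptions; the text here does not specify how REJEPA weights or applies these terms.

```python
import numpy as np


def vicreg_loss(pred, target, sim_w=25.0, var_w=25.0, cov_w=1.0, eps=1e-4):
    """Sketch of a VICReg-style loss on two (batch, dim) embedding arrays.

    Hypothetical helper: the weights and eps are assumed defaults,
    not values taken from the REJEPA paper.
    """
    n, d = pred.shape

    # Invariance: mean squared error between predicted and target embeddings.
    inv = np.mean((pred - target) ** 2)

    def variance_term(z):
        # Hinge on the per-dimension standard deviation: penalise any
        # dimension whose std over the batch falls below 1.
        std = np.sqrt(z.var(axis=0, ddof=1) + eps)
        return np.mean(np.maximum(0.0, 1.0 - std))

    def covariance_term(z):
        # Sum of squared off-diagonal covariance entries, scaled by dim,
        # which pushes different embedding dimensions to decorrelate.
        zc = z - z.mean(axis=0)
        cov = zc.T @ zc / (n - 1)
        off_diag = cov - np.diag(np.diag(cov))
        return np.sum(off_diag ** 2) / d

    var = variance_term(pred) + variance_term(target)
    cov = covariance_term(pred) + covariance_term(target)
    total = sim_w * inv + var_w * var + cov_w * cov
    return total, {"invariance": inv, "variance": var, "covariance": cov}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(size=(256, 8))
    pred = target + 0.1 * rng.normal(size=(256, 8))
    total, parts = vicreg_loss(pred, target)
    print(total, parts)
```

If every embedding in a batch collapses to the same vector, the invariance term can still be zero, but the variance term becomes large. That is the sense in which this kind of regulariser discourages collapse without relying on negative pairs.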
Related papers
- Efficient Discriminative Joint Encoders For Large Scale Vision-language Reranking (2025), 0.00
- Exploring Masked Autoencoders For Sensor-agnostic Image Retrieval In Remote Sensing (2024), 10.74
- A Novel Self-supervised Cross-modal Image Retrieval Method In Remote Sensing (2022), 8.35
- VLM2GeoVec: Toward Universal Multimodal Embeddings For Remote Sensing (2025), 0.00
- RREH: Reconstruction Relations Embedded Hashing For Semi-paired Cross-modal Retrieval (2024), 2.26
- Deep Learning For Image Search And Retrieval In Large Remote Sensing Archives (2020), 10.74
- Composed Image Retrieval For Remote Sensing (2024), 11.03
- iEBAKER: Improved Remote Sensing Image-text Retrieval Framework Via Eliminate Before Align And Keyword Explicit Reasoning (2025), 2.86