Contrastive Masked Auto-encoders Based Self-supervised Hashing For 2D Image And 3D Point Cloud Cross-modal Retrieval
2024 · Rukai Wei, Heng Cui, Yu Liu, et al.
Abstract
Implementing cross-modal hashing between 2D images and 3D point-cloud data is a growing concern in real-world retrieval systems. Simply applying existing cross-modal approaches to this new task fails to adequately capture latent multi-modal semantics and effectively bridge the modality gap between 2D and 3D. To address these issues without relying on hand-crafted labels, we propose contrastive masked autoencoders based self-supervised hashing (CMAH) for retrieval between images and point-cloud data. We start by contrasting 2D-3D pairs and explicitly constraining them into a joint Hamming space. This contrastive learning process ensures robust discriminability for the generated hash codes and effectively reduces the modality gap. Moreover, we utilize multi-modal auto-encoders to enhance the model's understanding of multi-modal semantics. By completing the masked image/point-cloud data modeling task, the model is encouraged to capture more localized clues. In addition, the proposed multi
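The contrastive step described in the abstract (pulling paired 2D and 3D codes together in a joint Hamming space) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the tanh relaxation of the binary codes, the symmetric InfoNCE objective, and the temperature of 0.2 are all assumptions chosen for illustration, and the masked auto-encoder branch is omitted entirely.

```python
import numpy as np


def relaxed_codes(features, weights):
    """Project features and squash to (-1, 1): a continuous stand-in for binary codes.

    Assumption: a single linear layer plus tanh; the paper's encoders are not specified here.
    """
    return np.tanh(features @ weights)


def info_nce(img_codes, pc_codes, temperature=0.2):
    """Symmetric InfoNCE loss; row i of each modality is treated as the positive pair.

    The temperature value is a hypothetical default, not taken from the paper.
    """
    a = img_codes / np.linalg.norm(img_codes, axis=1, keepdims=True)
    b = pc_codes / np.linalg.norm(pc_codes, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # cosine similarities, shape (n, n)

    def cross_entropy_diag(l):
        # Numerically stable log-softmax over each row, then pick the diagonal.
        l = l - l.max(axis=1, keepdims=True)
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the image-to-point-cloud and point-cloud-to-image directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))


def binarize(codes):
    """Map relaxed codes to {-1, +1} hash codes by sign."""
    return np.where(codes >= 0, 1, -1).astype(np.int8)


def hamming_distance(query, database):
    """Hamming distance between one {-1, +1} query code and each database code.

    For codes of length k, distance = (k - <query, code>) / 2.
    """
    k = query.shape[-1]
    return (k - database.astype(np.int32) @ query.astype(np.int32)) // 2
```

At retrieval time, under these assumptions, an image query would be encoded, binarized, and ranked against the binarized point-cloud codes by `hamming_distance` (or the reverse for point-cloud queries).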
Related papers
- Unsupervised Multi-modal Hashing For Cross-modal Retrieval (2019) · 8.35
- Cluster-wise Unsupervised Hashing For Cross-modal Similarity Search (2019) · 11.39
- Semantic-consistent Bidirectional Contrastive Hashing For Noisy Multi-label Cross-modal Retrieval (2025) · 0.00
- Discriminative Supervised Hashing For Cross-modal Similarity Search (2018) · 7.81
- Deep Class-guided Hashing For Multi-label Cross-modal Retrieval (2024) · 6.20
- Weakly-paired Cross-modal Hashing (2019) · 0.00
- Unsupervised Deep Cross-modality Spectral Hashing (2020) · 11.39
- Asymmetric Correlation Quantization Hashing For Cross-modal Retrieval (2020) · 10.74