MIFNet: Learning Modality-Invariant Features for Generalizable Multimodal Image Matching
2025 · Yepeng Liu, Zhichao Sun, Baosheng Yu, et al.
Abstract
Many keypoint detection and description methods have been proposed for image matching or registration. While these methods demonstrate promising performance for single-modality image matching, they often struggle with multimodal data because the descriptors trained on single-modality data tend to lack robustness against the non-linear variations present in multimodal data. Extending such methods to multimodal image matching often requires well-aligned multimodal data to learn modality-invariant descriptors. However, acquiring such data is often costly and impractical in many real-world scenarios. To address this challenge, we propose a modality-invariant feature learning network (MIFNet) to compute modality-invariant features for keypoint descriptions in multimodal image matching using only single-modality training data. Specifically, we propose a novel latent feature aggregation module and a cumulative hybrid aggregation module to enhance the base keypoint descriptors trained on singl