Modality-aware Feature Matching: A Comprehensive Review Of Single- And Cross-modality Techniques
2025 · Weide Liu, Wei Zhou, Jun Liu, et al.
Abstract
Feature matching is a cornerstone task in computer vision, essential for applications such as image retrieval, stereo matching, 3D reconstruction, and SLAM. This survey comprehensively reviews modality-based feature matching, exploring traditional handcrafted methods and emphasizing contemporary deep learning approaches across various modalities, including RGB images, depth images, 3D point clouds, LiDAR scans, medical images, and vision-language interactions. Traditional methods, leveraging detectors like Harris corners and descriptors such as SIFT and ORB, demonstrate robustness under moderate intra-modality variations but struggle with significant modality gaps. Contemporary deep learning-based methods, exemplified by the CNN-based detector-descriptor SuperPoint and the detector-free, transformer-based LoFTR, substantially improve robustness and adaptability across modalities. We highlight modality-aware advancements, such as geometric and depth-specific descriptors for depth images, sparse and
Related papers
- Local Feature Matching Using Deep Learning: A Survey (2024)
- Cross-modal Retrieval: A Systematic Review Of Methods And Future Directions (2023)
- New Ideas And Trends In Deep Multimodal Content Understanding: A Review (2020)
- Cross-domain Visual Matching Via Generalized Similarity Measure And Feature Learning (2016)
- MIFNet: Learning Modality-invariant Features For Generalizable Multimodal Image Matching (2025)
- Multimodal Representation Alignment For Cross-modal Information Retrieval (2025)
- MUST: An Effective And Scalable Framework For Multimodal Search Of Target Modality (2023)
- ModaLink: Unifying Modalities For Efficient Image-to-pointcloud Place Recognition (2024)