GSSF: Generalized Structural Sparse Function For Deep Cross-modal Metric Learning
2024 · Haiwen Diao, Ying Zhang, Shang Gao, et al.
Abstract
Cross-modal metric learning is a prominent research topic that bridges the semantic heterogeneity between vision and language. Existing methods typically transform pairwise features into a similarity score using either a simple cosine metric or a complex distance metric, which leaves the distance measurement either inadequate or inefficient. We therefore propose a Generalized Structural Sparse Function (GSSF) that dynamically captures thorough and powerful relationships across modalities for pairwise similarity learning while remaining concise and efficient. Specifically, the distance metric encapsulates two kinds of terms, diagonal and block-diagonal, which automatically distinguish and highlight cross-channel relevancy and dependency within a structured, organized topology. It can thus adapt to the optimal matching patterns between paired features and strikes a balance between model complexity and capability. Extensive experiments
Related papers
- Sharing Matters For Generalization In Deep Metric Learning (2020) · 8.35
- Semantic Granularity Metric Learning For Visual Search (2019) · 7.81
- Cross-domain Visual Matching Via Generalized Similarity Measure And Feature Learning (2016) · 15.54
- Deep Metric Structured Learning For Facial Expression Recognition (2020) · 0.00
- Signal-to-noise Ratio: A Robust Distance Metric For Deep Metric Learning (2019) · 13.60
- Discriminative Supervised Subspace Learning For Cross-modal Retrieval (2022) · 0.00
- Towards Interpretable Deep Metric Learning With Structural Matching (2021) · 15.87
- A New Similarity Space Tailored For Supervised Deep Metric Learning (2020) · 3.58