A Unified Optimal Transport Framework For Cross-modal Retrieval With Noisy Labels
2024 · Haochen Han, Minnan Luo, Huan Liu, et al.
Abstract
Cross-modal retrieval (CMR) aims to establish interaction between different modalities, among which supervised CMR is emerging due to its flexibility in learning semantic category discrimination. Despite the remarkable performance of previous supervised CMR methods, much of their success can be attributed to the well-annotated data. However, even for unimodal data, precise annotation is expensive and time-consuming, and it becomes more challenging with the multimodal scenario. In practice, massive multimodal data are collected from the Internet with coarse annotation, which inevitably introduces noisy labels. Training with such misleading labels would bring two key challenges -- enforcing the multimodal samples to *align incorrect semantics* and *widen the heterogeneous gap*, resulting in poor retrieval performance. To tackle these challenges, this work proposes UOT-RCL, a Unified framework based on Optimal Transport (OT) for Robust Cross-modal Retrieval. First, we propose a semantic a
Related papers
- Dual-view Curricular Optimal Transport For Cross-lingual Cross-modal Retrieval (2023) · 9.03
- Neighbor-aware Instance Refining With Noisy Labels For Cross-modal Retrieval (2025) · 2.26
- Learning To Rematch Mismatched Pairs For Robust Cross-modal Retrieval (2024) · 13.82
- Unsupervised Cross-domain Image Retrieval Via Prototypical Optimal Transport (2024) · 8.09
- Deep Reversible Consistency Learning For Cross-modal Retrieval (2025) · 7.81
- Semantic-consistent Bidirectional Contrastive Hashing For Noisy Multi-label Cross-modal Retrieval (2025) · 0.00
- Adversarial Cross-modal Retrieval Via Learning And Transferring Single-modal Similarities (2019) · 8.60
- Robust Self-paced Hashing For Cross-modal Retrieval With Noisy Labels (2025) · 7.81