Disentangled Noisy Correspondence Learning
2024 · Zhuohang Dang, Minnan Luo, Jihong Wang, et al.
Abstract
Cross-modal retrieval is crucial in understanding latent correspondences across modalities. However, existing methods implicitly assume well-matched training data, which is impractical as real-world data inevitably involves imperfect alignments, i.e., noisy correspondences. Although some works explore similarity-based strategies to address such noise, they suffer from sub-optimal similarity predictions influenced by modality-exclusive information (MEI), e.g., background noise in images and abstract definitions in texts. This issue arises as MEI is not shared across modalities, thus aligning it in training can markedly mislead similarity predictions. Moreover, although intuitive, directly applying previous cross-modal disentanglement methods suffers from limited noise tolerance and disentanglement efficacy. Inspired by the robustness of information bottlenecks against noise, we introduce DisNCL, a novel information-theoretic framework for feature Disentanglement in Noisy Correspondence
Related papers
- Noisy Correspondence Learning With Self-reinforcing Errors Mitigation (2023) 8.09
- Noisy Correspondence Learning With Meta Similarity Correction (2023) 11.67
- PCSR: Pseudo-label Consistency-guided Sample Refinement For Noisy Correspondence Learning (2025) 0.00
- Neighbor-aware Instance Refining With Noisy Labels For Cross-modal Retrieval (2025) 2.26
- INTENT: Invariance And Discrimination-aware Noise Mitigation For Robust Composed Image Retrieval (2026) 0.00
- PC\(^2\): Pseudo-classification Based Pseudo-captioning For Noisy Correspondence Learning In Cross-modal Retrieval (2024) 9.23
- Conesep: Cone-based Robust Noise-unlearning Compositional Network For Composed Image Retrieval (2026) 0.00
- A Unified Optimal Transport Framework For Cross-modal Retrieval With Noisy Labels (2024) 5.24