Cross-modal Adaptive Dual Association For Text-to-image Person Retrieval
2023 · Dixuan Lin, Yixing Peng, Jingke Meng, et al.
Abstract
Text-to-image person re-identification (ReID) aims to retrieve images of a person based on a given textual description. The key challenge is to learn the relations between detailed information from visual and textual modalities. Existing works focus on learning a latent space to narrow the modality gap and further build local correspondences between two modalities. However, these methods assume that image-to-text and text-to-image associations are modality-agnostic, resulting in suboptimal associations. In this work, we show the discrepancy between image-to-text association and text-to-image association and propose CADA: Cross-Modal Adaptive Dual Association that finely builds bidirectional image-text detailed associations. Our approach features a decoder-based adaptive dual association module that enables full interaction between visual and textual modalities, allowing for bidirectional and adaptive cross-modal correspondence associations. Specifically, the paper proposes a bidirectio
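The abstract's claim that image-to-text and text-to-image associations differ can be illustrated with a small sketch. This is not the CADA module, whose details are not given in the text above. It is only a toy example, assuming a dot-product similarity matrix and softmax normalization, and every function name and feature value below is hypothetical. Normalizing the same similarity matrix over text tokens (image-to-text) and over image patches (text-to-image) yields two different association maps.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def similarity(image_patches, text_tokens):
    # sim[i][j] = dot product between patch i and token j.
    return [[dot(p, t) for t in text_tokens] for p in image_patches]

def image_to_text(sim):
    # Each image patch spreads its weight over the text tokens (row-wise softmax).
    return [softmax(row) for row in sim]

def text_to_image(sim):
    # Each text token spreads its weight over the image patches (column-wise softmax).
    # The result is indexed [token][patch].
    return [softmax(list(col)) for col in zip(*sim)]

# Toy features (assumed values, for illustration only): 3 patches, 2 tokens.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0]]

sim = similarity(patches, tokens)
i2t = image_to_text(sim)   # shape: 3 patches x 2 tokens
t2i = text_to_image(sim)   # shape: 2 tokens x 3 patches

# The weight linking patch 0 and token 0 depends on the direction.
print(i2t[0][0], t2i[0][0])
```

In this toy setup the weight between patch 0 and token 0 is larger in the image-to-text direction than in the text-to-image direction, because the token must share its weight with a second, similar patch. A single direction-agnostic association would hide that asymmetry.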
Related papers
- Cross-modal Implicit Relation Reasoning And Aligning For Text-to-image Person Retrieval (2023) · 18.15
- Prototype-guided Cross-modal Completion And Alignment For Incomplete Text-based Person Re-identification (2023) · 6.77
- Dico: Disentangled Concept Representation For Text-to-image Person Re-identification (2026) · 4.65
- Dynamic Dual-attentive Aggregation Learning For Visible-infrared Person Re-identification (2020) · 19.67
- Multi-path Exploration And Feedback Adjustment For Text-to-image Person Retrieval (2024) · 0.00
- Multilingual Text-to-image Person Retrieval Via Bidirectional Relation Reasoning And Aligning (2025) · 2.35
- Deep Co-attention Based Comparators For Relative Representation Learning In Person Re-identification (2018) · 13.34
- Bridging The Gap: Multi-level Cross-modality Joint Alignment For Visible-infrared Person Re-identification (2023) · 11.29