BEAT: Bi-directional One-to-Many Embedding Alignment for Text-based Person Retrieval
2024 · Yiwei Ma, Xiaoshuai Sun, Jiayi Ji, et al.
Abstract
Text-based person retrieval (TPR) is a challenging task that involves retrieving a specific individual based on a textual description. Despite considerable efforts to bridge the gap between vision and language, the significant differences between these modalities continue to pose a challenge. Previous methods have attempted to align text and image samples in a modal-shared space, but they face uncertainties in optimization directions due to the movable features of both modalities and the failure to account for one-to-many relationships of image-text pairs in TPR datasets. To address this issue, we propose an effective bi-directional one-to-many embedding paradigm that offers a clear optimization direction for each sample, thus mitigating the optimization problem. Additionally, this embedding scheme generates multiple features for each sample without introducing trainable parameters, making it easier to align with several positive samples. Based on this paradigm, we propose a novel Bi-d
Related papers
- Multilingual Text-to-image Person Retrieval Via Bidirectional Relation Reasoning And Aligning (2025)
- Cross-modal Full-mode Fine-grained Alignment For Text-to-image Person Retrieval (2025)
- TIPCB: A Simple But Effective Part-based Convolutional Baseline For Text-based Person Search (2021)
- Text-guided Image Restoration And Semantic Enhancement For Text-to-image Person Retrieval (2023)
- Multi-path Exploration And Feedback Adjustment For Text-to-image Person Retrieval (2024)
- Boosting Weak Positives For Text Based Person Search (2025)
- Decoupled Cross-modal Alignment Network For Text-RGBT Person Retrieval And A High-quality Benchmark (2025)
- Improving Text-based Person Search Via Part-level Cross-modal Correspondence (2024)