Graph-based Cross-domain Knowledge Distillation For Cross-dataset Text-to-image Person Retrieval
2025 · Bingjun Luo, Jinpeng Wang, Wang Zewen, et al.
Abstract
Video surveillance systems are crucial components for ensuring public safety and management in smart cities. As a fundamental task in video surveillance, text-to-image person retrieval aims to retrieve, from an image gallery, the target person who best matches a given text description. Most existing text-to-image person retrieval methods are trained in a supervised manner that requires sufficient labeled data in the target domain. In practice, however, often only unlabeled data is available in the target domain because data annotation is difficult and costly, which limits the generalization of existing methods in practical application scenarios. To address this issue, we propose a novel unsupervised domain adaptation method, termed Graph-Based Cross-Domain Knowledge Distillation (GCKD), to learn the cross-modal feature representation for text-to-image person retrieval in a cross-dataset scenario. The proposed GCKD method consists of two main components. Firstly, a graph-bas
Related papers
- C2KD: Cross-lingual Cross-modal Knowledge Distillation For Multilingual Text-video Retrieval (2022) 8.94
- TEACHTEXT: Crossmodal Generalized Distillation For Text-video Retrieval (2021) 15.43
- ConaCLIP: Exploring Distillation Of Fully-connected Knowledge Interaction Graph For Lightweight Text-image Retrieval (2023) 4.52
- SynCDR: Training Cross Domain Retrieval Models With Synthetic Data (2023) 0.00
- Domain Adaptation In Multi-view Embedding For Cross-modal Video Retrieval (2021) 0.00
- Towards Identity-aware Cross-modal Retrieval: A Dataset And A Baseline (2024) 1.56
- Dual Learning With Dynamic Knowledge Distillation And Soft Alignment For Partially Relevant Video Retrieval (2025) 2.60
- Text-video Retrieval With Global-local Semantic Consistent Learning (2024) 8.75