The Devil Is In The Middle: Exploiting Mid-level Representations For Cross-domain Instance Matching
2017 · Qian Yu, Xiaobin Chang, Yi-Zhe Song, et al.
Abstract
Many vision problems require matching images of object instances across different domains. These include fine-grained sketch-based image retrieval (FG-SBIR) and Person Re-identification (person ReID). Existing approaches attempt to learn a joint embedding space where images from different domains can be directly compared. In most cases, this space is defined by the output of the final layer of a deep neural network (DNN), which primarily contains features of a high semantic level. In this paper, we argue that both high and mid-level features are relevant for cross-domain instance matching (CDIM). Importantly, mid-level features already exist in earlier layers of the DNN. They just need to be extracted, represented, and fused properly with the final layer. Based on this simple but powerful idea, we propose a unified framework for CDIM. Instantiating our framework for FG-SBIR and ReID, we show that our simple models can easily beat the state-of-the-art models, which are often equipped wi
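The core idea in the abstract, pulling mid-level features out of earlier layers and fusing them with the final-layer output before matching across domains, can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and not the authors' method: global average pooling as the way to represent mid-level feature maps, L2 normalization plus concatenation as the fusion step, Euclidean distance for ranking, and all function names (`global_average_pool`, `fuse_features`, `rank_gallery`) are hypothetical. Feature maps are plain nested lists standing in for DNN activations.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def global_average_pool(feature_map):
    """Average each channel of a channels x height x width nested list.

    Assumption: pooling is one simple way to turn a mid-level feature
    map into a fixed-length vector; the paper may use something else.
    """
    return [
        sum(sum(row) for row in channel) / sum(len(row) for row in channel)
        for channel in feature_map
    ]


def fuse_features(mid_maps, final_vec):
    """Concatenate normalized mid-level vectors with the final-layer vector.

    Each part is normalized separately so that no single layer
    dominates the distance purely because of its activation scale.
    """
    parts = [l2_normalize(global_average_pool(m)) for m in mid_maps]
    parts.append(l2_normalize(final_vec))
    return [x for part in parts for x in part]


def rank_gallery(query, gallery):
    """Return gallery indices ordered from nearest to farthest from the query."""
    return sorted(range(len(gallery)), key=lambda i: math.dist(query, gallery[i]))


# Toy input: one mid-level map with 2 channels of size 2x2, plus a
# 2-D final-layer embedding (real networks would be far larger).
mid_maps = [[[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [0.0, 6.0]]]]
final_vec = [3.0, 4.0]

query = fuse_features(mid_maps, final_vec)
print(len(query))  # 2 pooled channels + 2 final-layer dims = 4
```

In a retrieval setting such as FG-SBIR, `query` would come from the sketch branch and each gallery entry from the photo branch, with `rank_gallery(query, gallery)` giving the retrieval order. Normalizing each part before concatenation is a design choice of this sketch; a learned fusion layer would be another option.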
Related papers
- Cross-domain Visual Matching Via Generalized Similarity Measure And Feature Learning (2016) 15.54
- Caption-matching: A Multimodal Approach For Cross-domain Image Retrieval (2024) 0.00
- Devil's In The Details: Aligning Visual Clues For Conditional Embedding In Person Re-identification (2020) 0.00
- IDMR: Towards Instance-driven Precise Visual Correspondence In Multimodal Retrieval (2025) 2.29
- Bridging The Gap: Multi-level Cross-modality Joint Alignment For Visible-infrared Person Re-identification (2023) 11.29
- Cross-domain Image Matching With Deep Feature Maps (2018) 12.33
- Deep Co-attention Based Comparators For Relative Representation Learning In Person Re-identification (2018) 13.34
- Unified Representation Learning For Cross Model Compatibility (2020) 5.24