Integrating Language Guidance Into Vision-based Deep Metric Learning
2022 · Karsten Roth, Oriol Vinyals, Zeynep Akata
Abstract
Deep Metric Learning (DML) proposes to learn metric spaces which encode semantic similarities as embedding space distances. These spaces should be transferable to classes beyond those seen during training. Commonly, DML methods task networks to solve contrastive ranking tasks defined over binary class assignments. However, such approaches ignore higher-level semantic relations between the actual classes. This causes learned embedding spaces to encode incomplete semantic context and misrepresent the semantic relation between classes, impacting the generalizability of the learned metric space. To tackle this issue, we propose a language guidance objective for visual similarity learning. Leveraging language embeddings of expert- and pseudo-classnames, we contextualize and realign visual representation spaces corresponding to meaningful language semantics for better semantic consistency. Extensive experiments and ablations provide a strong motivation for our proposed approach and show lang
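The idea of realigning a visual embedding space with language semantics can be illustrated with a small sketch. This is not necessarily the authors' exact objective: it assumes that each image in a batch comes with a language embedding of its class name (from some pretrained language model, not shown here), and that alignment is measured by a row-wise KL divergence between softmax-normalized similarity matrices. The function names and the `temperature` parameter are hypothetical and chosen for illustration only.

```python
import numpy as np


def row_softmax(x):
    """Numerically stable softmax over each row."""
    x = x - x.max(axis=1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=1, keepdims=True)


def cosine_similarity_matrix(emb):
    """Pairwise cosine similarities between the rows of `emb`."""
    normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return normed @ normed.T


def language_guidance_loss(visual_emb, language_emb, temperature=1.0):
    """Illustrative alignment loss (an assumption, not the paper's definition).

    visual_emb:   (batch, d_v) image embeddings from the vision network.
    language_emb: (batch, d_l) language embeddings of each sample's class name.
    Returns the mean row-wise KL(language || visual) between the two
    softmax-normalized batch similarity matrices.
    """
    sim_v = cosine_similarity_matrix(visual_emb) / temperature
    sim_l = cosine_similarity_matrix(language_emb) / temperature
    p_l = row_softmax(sim_l)                 # target distribution from language
    log_p_v = np.log(row_softmax(sim_v))     # visual distribution to be aligned
    kl = (p_l * (np.log(p_l) - log_p_v)).sum(axis=1)
    return float(kl.mean())


# Usage with random stand-in embeddings; real ones would come from trained models.
rng = np.random.default_rng(0)
visual = rng.normal(size=(4, 8))
language = rng.normal(size=(4, 16))
loss = language_guidance_loss(visual, language)
```

Because only the batch-by-batch similarity matrices are compared, the visual and language embeddings may have different dimensionalities. In practice a term like this would be added to a standard DML ranking loss rather than used on its own.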
Related papers
- Guided Deep Metric Learning (2022)
- Improving Deep Metric Learning By Divide And Conquer (2021)
- Revisiting Training Strategies And Generalization Performance In Deep Metric Learning (2020)
- Learning With Memory-based Virtual Classes For Deep Metric Learning (2021)
- DiVA: Diverse Visual Feature Aggregation For Deep Metric Learning (2020)
- InDiReCT: Language-guided Zero-shot Deep Metric Learning For Images (2022)
- S2SD: Simultaneous Similarity-based Self-distillation For Deep Metric Learning (2020)
- A Framework To Enhance Generalization Of Deep Metric Learning Methods Using General Discriminative Feature Learning And Class Adversarial Neural Networks (2021)