InDiReCT: Language-Guided Zero-Shot Deep Metric Learning for Images
2022 · Konstantin Kobs, Michael Steininger, Andreas Hotho
Abstract
Common Deep Metric Learning (DML) datasets specify only one notion of similarity, e.g., two images in the Cars196 dataset are deemed similar if they show the same car model. We argue that depending on the application, users of image retrieval systems have different and changing similarity notions that should be incorporated as easily as possible. Therefore, we present Language-Guided Zero-Shot Deep Metric Learning (LanZ-DML) as a new DML setting in which users control the properties that should be important for image representations without training data by only using natural language. To this end, we propose InDiReCT (Image representations using Dimensionality Reduction on CLIP embedded Texts), a model for LanZ-DML on images that exclusively uses a few text prompts for training. InDiReCT utilizes CLIP as a fixed feature extractor for images and texts and transfers the variation in text prompt embeddings to the image embedding space. Extensive experiments on five datasets and overall t
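The idea described in the abstract, deriving a dimensionality reduction from text prompt embeddings and applying it to image embeddings, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the reduction technique, so plain PCA (via SVD) is used as a stand-in, and random vectors take the place of real CLIP embeddings. The function names `fit_text_projection` and `embed_images`, the embedding width of 512, and the choice of 8 components are all hypothetical.

```python
import numpy as np


def fit_text_projection(text_emb: np.ndarray, n_components: int) -> np.ndarray:
    """Return a (d, n_components) matrix of the top principal directions
    of the text prompt embeddings (PCA is an assumption, see lead-in)."""
    centered = text_emb - text_emb.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components].T


def embed_images(image_emb: np.ndarray, projection: np.ndarray) -> np.ndarray:
    """Project image embeddings onto the text-derived subspace and L2-normalize."""
    reduced = image_emb @ projection
    norms = np.linalg.norm(reduced, axis=1, keepdims=True)
    return reduced / np.clip(norms, 1e-12, None)


# Random stand-ins for embeddings from a frozen CLIP model (hypothetical data).
rng = np.random.default_rng(0)
text_emb = rng.normal(size=(20, 512))   # e.g. 20 prompts describing the desired property
image_emb = rng.normal(size=(5, 512))   # e.g. 5 images to compare

projection = fit_text_projection(text_emb, n_components=8)
image_repr = embed_images(image_emb, projection)

# Cosine similarity between images in the reduced space.
similarity = image_repr @ image_repr.T
```

In this sketch, only directions along which the prompts vary survive the projection, so image similarity is measured with respect to the property the prompts describe. With a real CLIP model, `text_emb` and `image_emb` would come from its text and image encoders.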
Related papers
- Guided Deep Metric Learning (2022) 6.77
- Integrating Language Guidance Into Vision-based Deep Metric Learning (2022) 14.04
- Hybrid-attention Based Decoupled Metric Learning For Zero-shot Image Retrieval (2019) 12.93
- Directional Statistics-based Deep Metric Learning For Image Classification And Retrieval (2018) 13.05
- S2SD: Simultaneous Similarity-based Self-distillation For Deep Metric Learning (2020) 3.31
- Improving Deep Metric Learning By Divide And Conquer (2021) 8.09
- Revisiting Training Strategies And Generalization Performance In Deep Metric Learning (2020) 5.08
- Introspective Deep Metric Learning For Image Retrieval (2022) 3.50