Dual-modal Prompting For Sketch-based Image Retrieval
2024 · Liying Gao, Bingliang Jiao, Peng Wang, et al.
Abstract
Sketch-based image retrieval (SBIR) associates hand-drawn sketches with their corresponding realistic images. In this study, we aim to tackle two major challenges of this task simultaneously: i) zero-shot, dealing with unseen categories, and ii) fine-grained, referring to intra-category instance-level retrieval. Our key innovation lies in the realization that solely addressing this cross-category and fine-grained recognition task from the generalization perspective may be inadequate since the knowledge accumulated from limited seen categories might not be fully valuable or transferable to unseen target categories. Inspired by this, in this work, we propose a dual-modal prompting CLIP (DP-CLIP) network, in which an adaptive prompting strategy is designed. Specifically, to facilitate the adaptation of our DP-CLIP toward unpredictable target categories, we employ a set of images within the target category and the textual category label to respectively construct a set of category-adaptive
Related papers
- Elevating All Zero-shot Sketch-based Image Retrieval Through Multimodal Prompt Learning (2024) · 6.34
- CLIP For All Things Zero-shot Sketch-based Image Retrieval, Fine-grained Or Not (2023) · 15.54
- Sketch Less For More: On-the-fly Fine-grained Sketch Based Image Retrieval (2020) · 15.28
- Cross-modal Subspace Learning For Fine-grained Sketch-based Image Retrieval (2017) · 13.34
- Relation-aware Meta-learning For Zero-shot Sketch-based Image Retrieval (2024) · 0.00
- Sketch And Text Synergy: Fusing Structural Contours And Descriptive Attributes For Fine-grained Image Retrieval (2026) · 0.00
- Back To The Drawing Board: Rethinking Scene-level Sketch-based Image Retrieval (2025) · 0.00
- Doodle To Search: Practical Zero-shot Sketch-based Image Retrieval (2019) · 16.75