Efficient Image-text Retrieval Via Keyword-guided Pre-screening
2023 · Min Cao, Yang Bai, Jingyao Wang, et al.
Abstract
Despite rapid gains in performance, current image-text retrieval methods suffer from \(N\)-related time complexity, which hinders their application in practice. To improve efficiency, this paper presents a simple and effective keyword-guided pre-screening framework for image-text retrieval. Specifically, we convert the image and text data into keywords and perform keyword matching across modalities to exclude a large number of irrelevant gallery samples before the retrieval network is applied. For keyword prediction, we cast the task as a multi-label classification problem and propose a multi-task learning scheme that appends multi-label classifiers to the image-text retrieval network, yielding lightweight and high-performance keyword prediction. For keyword matching, we introduce the inverted index used in search engines, which benefits both the time and space complexity of the pre-screening. Extensive experiments on two wid
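The keyword-matching step described in the abstract can be illustrated with a minimal sketch of an inverted index. This is not the authors' implementation: the keyword sets below are hand-written stand-ins for classifier predictions, the function names `build_inverted_index` and `pre_screen` are hypothetical, and the rule "keep a gallery item if it shares at least one keyword with the query" is an assumption made for illustration.

```python
from collections import defaultdict


def build_inverted_index(gallery_keywords):
    """Map each keyword to the set of gallery ids whose keyword set contains it."""
    index = defaultdict(set)
    for item_id, keywords in gallery_keywords.items():
        for kw in keywords:
            index[kw].add(item_id)
    return index


def pre_screen(index, query_keywords):
    """Return gallery ids sharing at least one keyword with the query.

    Assumed matching rule for this sketch; only the posting lists of the
    query's keywords are touched, not the whole gallery.
    """
    candidates = set()
    for kw in query_keywords:
        candidates |= index.get(kw, set())
    return candidates


# Hand-written keyword sets standing in for predicted keywords (assumption).
gallery = {
    "img_0": {"dog", "grass", "ball"},
    "img_1": {"cat", "sofa"},
    "img_2": {"dog", "beach"},
    "img_3": {"car", "street"},
}

index = build_inverted_index(gallery)
candidates = pre_screen(index, {"dog", "frisbee"})
print(sorted(candidates))
```

In a pipeline of this shape, only the surviving candidates would be passed to the more expensive retrieval network, so the cost of the second stage depends on the candidate count rather than on the full gallery.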
Related papers
- Lexlip: Lexicon-bottlenecked Language-image Pre-training For Large-scale Image-text Retrieval (2023) 10.85
- Leaner And Faster: Two-stage Model Compression For Lightweight Text-image Retrieval (2022) 6.34
- Image-text Retrieval Via Preserving Main Semantics Of Vision (2023) 10.22
- Enhancing Image Retrieval: A Comprehensive Study On Photo Search Using The CLIP Mode (2024) 0.00
- ARTEMIS: Attention-based Retrieval With Text-explicit Matching And Implicit Similarity (2022) 0.00
- Revising Image-text Retrieval Via Multi-modal Entailment (2022) 0.00
- Focus, Distinguish, And Prompt: Unleashing CLIP For Efficient And Flexible Scene Text Retrieval (2024) 8.80
- Learnable Pillar-based Re-ranking For Image-text Retrieval (2023) 9.92