Pseudo-triplet Guided Few-shot Composed Image Retrieval
2024 · Bohan Hou, Haoqiang Lin, Haokun Wen, et al.
Abstract
Composed Image Retrieval (CIR) is a challenging task that aims to retrieve the target image with a multimodal query, i.e., a reference image and its complementary modification text. Because previous supervised and zero-shot learning paradigms fail to strike a good trade-off between the model's generalization ability and retrieval performance, recent researchers have introduced the task of few-shot CIR (FS-CIR) and proposed a textual inversion-based network built on a pretrained CLIP model to realize it. Despite its promising performance, the approach has two key limitations: it relies solely on the few annotated samples for CIR model training, and it selects training triplets for CIR model fine-tuning indiscriminately. To address these two limitations, we propose a novel two-stage pseudo-triplet-guided few-shot CIR scheme, dubbed PTG-FSCIR. In the first stage, we propose an attentive masking and captioning-based pseudo-triplet generation method to construct pseudo triplets from pure i
Related papers
- Triplet Synthesis For Enhancing Composed Image Retrieval Via Counterfactual Image Generation (2025) 3.58
- Automatic Synthesis Of High-quality Triplet Data For Composed Image Retrieval (2025) 0.00
- From Mapping To Composing: A Two-stage Framework For Zero-shot Composed Image Retrieval (2025) 0.00
- Scale Up Composed Image Retrieval Learning Via Modification Text Generation (2025) 3.58
- Scaling Prompt Instructed Zero Shot Composed Image Retrieval With Image-only Data (2025) 0.00
- Improving Composed Image Retrieval Via Contrastive Learning With Scaling Positives And Negatives (2024) 11.30
- Image2sentence Based Asymmetrical Zero-shot Composed Image Retrieval (2024) 0.00
- Compositional Image Retrieval Via Instruction-aware Contrastive Learning (2024) 0.00