MELT: Improve Composed Image Retrieval Via The Modification Frequentation-rarity Balance Network
2026 · Guozhi Qiu, Zhiwei Chen, Zixu Li, et al.
Abstract
Composed Image Retrieval (CIR) uses a reference image and a modification text as a query to retrieve a target image satisfying the requirement of "modifying the reference image according to the text instructions". However, existing CIR methods face two limitations: (1) frequency bias leading to "Rare Sample Neglect", and (2) susceptibility of similarity scores to interference from hard negative samples and noise. To address these limitations, we confront two key challenges: asymmetric rare semantic localization and robust similarity estimation under hard negative samples. To solve these challenges, we propose the Modification frEquentation-rarity baLance neTwork (MELT). MELT assigns increased attention to rare modification semantics in multimodal contexts while applying diffusion-based denoising to hard negative samples with high similarity scores, enhancing multimodal fusion and matching. Extensive experiments on two CIR benchmarks validate the superior performance of MELT. Codes ar