MUST: An Effective And Scalable Framework For Multimodal Search Of Target Modality
2023 · Mengzhao Wang, Xiangyu Ke, Xiaoliang Xu, et al.
Abstract
We investigate the problem of multimodal search of target modality, in which a query in a specific target modality is enhanced with information from auxiliary modalities. The goal is to retrieve relevant objects whose contents in the target modality match the specified multimodal query. We first introduce two baseline approaches that integrate techniques from the Database, Information Retrieval, and Computer Vision communities. These baselines either merge the results of separate vector searches for each modality or perform a single-channel vector search by fusing all modalities. However, both baselines are limited in efficiency and accuracy because they fail to adequately consider the varying importance of fusing information across modalities. To overcome these limitations, we propose a novel framework called MUST. Our framework employs a hybrid fusion mechanism, combining different modalities at multiple stages. Notably, we leverage ve
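The two baselines described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`top_k`, `merge_baseline`, `fused_baseline`), the brute-force search standing in for a vector index, the sum-of-distances merge rule, the unweighted concatenation, and the toy data are all assumptions made for illustration. It covers only the baselines, not MUST itself.

```python
import math

def top_k(query, vectors, k):
    """Brute-force nearest-neighbor search; returns object ids, nearest first."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return order[:k]

def merge_baseline(queries, corpus, k, per_modality_k):
    """Baseline 1 (sketch): one search per modality, then merge the candidates.

    Assumed merge rule: rank the union of candidates by the sum of their
    per-modality distances to the query.
    """
    candidates = set()
    for m, q in enumerate(queries):
        candidates.update(top_k(q, corpus[m], per_modality_k))

    def total_distance(i):
        return sum(math.dist(q, corpus[m][i]) for m, q in enumerate(queries))

    return sorted(candidates, key=total_distance)[:k]

def fused_baseline(queries, corpus, k):
    """Baseline 2 (sketch): concatenate all modalities, then search once."""
    fused_query = [x for q in queries for x in q]
    n_objects = len(corpus[0])
    fused_corpus = [
        [x for modality in corpus for x in modality[i]] for i in range(n_objects)
    ]
    return top_k(fused_query, fused_corpus, k)

# Toy data (hypothetical): corpus[m][i] is the embedding of object i in modality m.
corpus = [
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # target modality
    [[0.0], [1.0], [1.0], [0.0]],                      # auxiliary modality
]
queries = [[0.9, 0.1], [1.0]]  # one query vector per modality

print(merge_baseline(queries, corpus, k=1, per_modality_k=2))
print(fused_baseline(queries, corpus, k=1))
```

Both functions treat every modality as equally important. That is the shortcoming the abstract attributes to the baselines: neither accounts for the varying importance of each modality when fusing.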
Related papers
- Joint Fusion And Encoding: Advancing Multimodal Retrieval From The Ground Up (2025) 0.00
- Mire: Enhancing Multimodal Queries Representation Via Fusion-free Modality Interaction For Multimodal Retrieval (2024) 3.81
- Modality Curation: Building Universal Embeddings For Advanced Multimodal Information Retrieval (2025) 0.00
- Cross-modal Retrieval: A Systematic Review Of Methods And Future Directions (2023) 12.81
- Uniecs: Unified Multimodal E-commerce Search Framework With Gated Cross-modal Fusion (2025) 2.60
- Modal-aware Features For Multimodal Hashing (2019) 0.00
- MMMORRF: Multimodal Multilingual Modularized Reciprocal Rank Fusion (2025) 2.26
- Semantic-enhanced Modality-asymmetric Retrieval For Online E-commerce Search (2025) 0.00