Needle: A Generative AI-powered Multi-modal Database For Answering Complex Natural Language Queries
2024 · Mahdi Erfanian, Mohsen Dehghankar, Abolfazl Asudeh
Abstract
Multi-modal datasets, like those involving images, often miss the detailed descriptions that properly capture the rich information encoded in each item. This makes answering complex natural language queries a major challenge in this domain. In particular, unlike the traditional nearest neighbor search, where the tuples and the query are represented as points in a single metric space, these settings involve queries and tuples embedded in fundamentally different spaces, making the traditional query answering methods inapplicable. Existing literature addresses this challenge for image datasets through vector representations jointly trained on natural language and images. This technique, however, underperforms for complex queries due to various reasons. This paper takes a step towards addressing this challenge by introducing a Generative-based Monte Carlo method that utilizes foundation models to generate synthetic samples that capture the complexity of the natural language query and rep
Related papers
- Needledb: A Generative-AI Based System For Accurate And Efficient Image Retrieval Using Complex Natural Language Queries (2026) · 0.00
- Murag: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text (2022) · 14.66
- Multimodal Neural Databases (2023) · 10.74
- I Want This Product But Different: Multimodal Retrieval With Synthetic Query Expansion (2021) · 0.00
- Multimodal Hypothetical Summary For Retrieval-based Multi-image Question Answering (2024) · 0.00
- Multimodal Needle In A Haystack: Benchmarking Long-context Capability Of Multimodal Large Language Models (2024) · 11.84
- Tiger: Unifying Text-to-image Generation And Retrieval With Large Multimodal Models (2024) · 0.00
- Generative Retrieval As Multi-vector Dense Retrieval (2024) · 8.60