E-VQA
Emerging
9 papers using it
14 HF downloads
0 HF likes
2022 first seen
The E-VQA dataset is used to evaluate knowledge-based visual question answering: it assesses how well a system combines visual understanding with external knowledge retrieval when answering multimodal queries.
Papers using E-VQA (9)
- Region-R1: Reinforcing Query-Side Region Cropping for Multi-Modal Re-Ranking
- WikiSeeker: Rethinking the Role of Vision-Language Models in Knowledge-Based Visual Question Answering
- Learning to Search: A Decision-Based Agent for Knowledge-Based Visual Question Answering
- When RAG Hurts: Diagnosing and Mitigating Attention Distraction in Retrieval-Augmented LVLMs
- CC-VQA: Conflict- and Correlation-Aware Method for Mitigating Knowledge Conflict in Knowledge-Based Visual Question Answering
- Reconstruction as a Bridge for Event-Based Visual Question Answering
- Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering
- Towards Reasoning-Aware Explainable VQA
- Multimodal Rationales for Explainable Visual Question Answering