Visual Question Answering
Emerging
6 papers using it
First seen: 2024
Visual Question Answering (VQA) is a benchmark that evaluates the ability of models to answer questions about images, using both multiple-choice and caption-based tasks.
Papers using Visual Question Answering (6)
- Cross-Modal Attention Guided Unlearning in Vision-Language Models
- Do LVLMs Know What They Know? A Systematic Study of Knowledge Boundary Perception in LVLMs
- Towards Resource-efficient Multimodal Intelligence: Learned Routing Among Specialized Expert Models
- Adaptive Token Boundaries: Integrating Human Chunking Mechanisms into Multimodal LLMs
- Uncertainty-Aware Evaluation for Vision-Language Models
- Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM