← all datasets

VQA

Canonical
49 papers using it
First seen: 2023

The VQA (Visual Question Answering) dataset pairs images with natural-language questions and their answers. It is used to evaluate how well models understand and reason about visual content when answering natural-language queries.

Papers using VQA (49)

VQA | datasets | multimodal