
MTVQA

Emerging
3 papers using it
First seen: 2024

MTVQA is a benchmark for multilingual Text-Centric Visual Question Answering. It contains 6,778 question-answer pairs across 2,116 images and is designed to assess how well AI models understand text-centric scenes in nine diverse languages.

