MTVQA
Emerging · 3 papers using it
First seen: 2024
MTVQA is a benchmark for multilingual text-centric visual question answering. It contains 6,778 question-answer pairs across 2,116 images and is designed to assess how well AI models understand text-centric scenes in nine diverse languages.