ST-VQA
Emerging
6 papers using it
150 HF downloads
5 HF likes
First seen: 2022
The ST-VQA (Scene Text Visual Question Answering) dataset is used to evaluate text-based visual question answering (TextVQA) approaches. It provides images paired with questions whose answers require understanding both the visual content and the text that appears in the scene.
Papers using ST-VQA (6)
- When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs
- MGA-VQA: Secure And Interpretable Graph-augmented Visual Question Answering With Memory-guided Protection Against Unauthorized Knowledge Use
- Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering
- TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation
- SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering
- Locate Then Generate: Bridging Vision and Language with Bounding Box for Scene-Text VQA