STE-VQA
Emerging, 6 papers using it
First seen 2022
STE-VQA is a dataset/benchmark for evaluating Document Visual Question Answering models on their understanding of textual semantics, spatial layout, and visual features.
Papers using STE-VQA (6)
- When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs
- MGA-VQA: Secure And Interpretable Graph-augmented Visual Question Answering With Memory-guided Protection Against Unauthorized Knowledge Use
- Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering
- TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation
- SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering
- Locate Then Generate: Bridging Vision and Language with Bounding Box for Scene-Text VQA