M-3D-VQA
Emerging
4 papers using it
First seen: 2023
The M3D-VQA benchmark pairs volumetric CT images with natural-language questions about them and is used to evaluate visual question answering (VQA) systems in medical imaging.
Papers using M-3D-VQA (4)
- Computed Tomography Visual Question Answering with Cross-modal Feature Graphing
- Multi-CLIP: Contrastive Vision-Language Pre-training for Question Answering tasks in 3D Scenes
- Evaluating Zero-Shot GPT-4V Performance on 3D Visual Question Answering Benchmarks
- 3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding