Visual Genome
Canonical
9 papers using it
First seen: 2024
Papers using Visual Genome (9)
- PRISM-0: A Predicate-Rich Scene Graph Generation Framework for Zero-Shot Open-Vocabulary Tasks
- Good Scores, Bad Data: A Metric for Multimodal Coherence
- Investigating Spatial Attention Bias in Vision-Language Models
- Multimodal Arabic Captioning With Interpretable Visual Concept Integration
- Dynamic Context-aware Scene Reasoning Using Vision-language Alignment In Zero-shot Real-world Scenarios
- Compositional Image-Text Matching and Retrieval by Grounding Entities
- DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models
- Progressive Multi-granular Alignments for Grounded Reasoning in Large Vision-Language Models
- Text-Region Matching for Multi-Label Image Recognition with Missing Labels