Visual Genome
Canonical
6 papers using it
2,296 HF downloads
83 HF likes
2016 first seen
Visual Genome enables models to reason about objects and the relationships between them. Its authors collected dense annotations of objects, attributes, and relationships within each image. Specifically, the dataset contains over 108K images, each with an average of 35 objects, 26 attributes, and 21 pairwise relationships between objects.
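The annotation structure described above (objects, their attributes, and pairwise relationships) can be pictured as a small scene graph. The following is a minimal sketch only: the class names, field names, and bounding-box layout are hypothetical and chosen for illustration, and they are not the dataset's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SceneObject:
    """One annotated object: a name, a bounding box, and free-form attributes."""
    name: str
    box: tuple[int, int, int, int]  # hypothetical (x, y, width, height) layout
    attributes: list[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A directed pairwise relationship; objects are referenced by list index."""
    subject: int
    predicate: str
    obj: int


@dataclass
class SceneGraph:
    """All annotations for a single image (illustrative container, not the real format)."""
    objects: list[SceneObject]
    relationships: list[Relationship]

    def triples(self) -> list[tuple[str, str, str]]:
        """Resolve index-based relationships into (subject, predicate, object) names."""
        return [
            (self.objects[r.subject].name, r.predicate, self.objects[r.obj].name)
            for r in self.relationships
        ]


# A toy image with two objects and one relationship between them.
graph = SceneGraph(
    objects=[
        SceneObject("man", (10, 20, 80, 200), ["tall"]),
        SceneObject("horse", (60, 90, 220, 180), ["brown"]),
    ],
    relationships=[Relationship(subject=0, predicate="riding", obj=1)],
)

print(graph.triples())
```

Storing relationships as index pairs rather than nested objects keeps each object defined once, which matters when an image carries dozens of objects and relationships, as the averages above suggest.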
Hugging Face · cc-by-4.0
Papers using Visual Genome (6)
- PhraseCut: Language-based Image Segmentation In The Wild
- Towards Open-vocabulary Scene Graph Generation With Prompt-based Finetuning
- Zero-shot Object Detection
- Dense Captioning With Joint Inference And Visual Context
- Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection
- Zero-shot Object Detection Through Vision-language Embedding Alignment