COCO-2017

Name: COCO-2017
License: unknown

Emerging

8 papers using it

20 HF downloads

0 HF likes

First seen: 2024

COCO 2017 image captions in Vietnamese. The dataset was first introduced in dinhanhx/VisualRoBERTa. I used VinAI tools to translate the COCO 2017 image captions (2017 Train/Val annotations) from English to Vietnamese, then merged the UIT-ViIC dataset into it. To load the dataset, one can take a look at this code in VisualRoBER
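The merge step described above can be pictured with a minimal sketch. The card does not document the actual schema or merge procedure, so this assumes both sources follow the standard COCO captions layout (records with `image_id`, `id`, and `caption`) and share COCO image ids. The helper names `group_captions` and `merge_caption_sets`, and all ids and caption text, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_captions(annotations):
    """Map each image_id to the list of its captions, preserving input order."""
    grouped = defaultdict(list)
    for ann in annotations:
        grouped[ann["image_id"]].append(ann["caption"])
    return dict(grouped)


def merge_caption_sets(base, extra):
    """Merge two {image_id: [captions]} mappings, skipping exact duplicate captions."""
    # Copy the lists so the inputs are not mutated.
    merged = {image_id: list(caps) for image_id, caps in base.items()}
    for image_id, caps in extra.items():
        target = merged.setdefault(image_id, [])
        for cap in caps:
            if cap not in target:
                target.append(cap)
    return merged


# Toy records in the COCO captions layout (hypothetical ids and text,
# not taken from the real dataset).
translated_coco = [
    {"image_id": 1, "id": 10, "caption": "Một con mèo nằm trên ghế."},
    {"image_id": 2, "id": 11, "caption": "Hai người đang chơi bóng chày."},
]
uit_viic = [
    {"image_id": 2, "id": 20, "caption": "Một cầu thủ đang vung gậy."},
    {"image_id": 3, "id": 21, "caption": "Một trận đấu quần vợt."},
]

merged = merge_caption_sets(group_captions(translated_coco), group_captions(uit_viic))
print(len(merged))
```

Keying on `image_id` keeps every caption for a shared image together, which is one plausible way to combine a translated set with a natively written one; the real dataset may be organized differently.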

🤗 Hugging Face · ⚖ License: unknown

Papers using COCO-2017 (8)

Can Multimodal Large Language Models Understand Spatial Relations? · 2025 · 1 cite

Context-Dependent Affordance Computation in Vision-Language Models · 2026

Enhancing Open-Vocabulary Object Detection through Multi-Level Fine-Grained Visual-Language Alignment · 2026

CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language Tasks · 2025

Extreme Model Compression For Edge Vision-language Models: Sparse Temporal Token Fusion And Adaptive Neural Compression · 2025

Caprecover: A Cross-modality Feature Inversion Attack Framework On Vision Language Models · 2025

SJTU: Spatial judgments in multimodal models towards unified segmentation through coordinate detection · 2024

An Enhanced Large Language Model For Cross Modal Query Understanding System Using DL-KeyBERT Based CAZSSCL-MPGPT · 2025