COCO-2017
Emerging
8 papers using it
2024 first seen
Papers using COCO-2017 (8)
- Can Multimodal Large Language Models Understand Spatial Relations?
- Context-Dependent Affordance Computation in Vision-Language Models
- Enhancing Open-Vocabulary Object Detection through Multi-Level Fine-Grained Visual-Language Alignment
- CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language Tasks
- Extreme Model Compression For Edge Vision-language Models: Sparse Temporal Token Fusion And Adaptive Neural Compression
- Caprecover: A Cross-modality Feature Inversion Attack Framework On Vision Language Models
- An Enhanced Large Language Model For Cross Modal Query Understanding System Using DL-KeyBERT Based CAZSSCL-MPGPT
- SJTU: Spatial judgments in multimodal models towards unified segmentation through coordinate detection