SEED-Bench

Canonical

9 papers using it

First seen: 2023

SEED-Bench Card. Benchmark details. Benchmark type: SEED-Bench is a large-scale benchmark for evaluating Multimodal Large Language Models (MLLMs). It consists of 19K multiple-choice questions with accurate human annotations, covering 12 evaluation dimensions, including the comprehension of both the image and video modali

Papers using SEED-Bench (9)

Make Your LVLM KV Cache More Lightweight · 2026

LLMind: Bio-inspired Training-free Adaptive Visual Representations for Vision-Language Models · 2026

Same Answer, Different Representations: Hidden instability in VLMs · 2026

Vision to Geometry: 3D Spatial Memory for Sequential Embodied MLLM Reasoning and Exploration · 2025

HybridToken-VLM: Hybrid Token Compression for Vision-Language Models · 2025

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition · 2023 · 31 cites

EE-MLLM: A Data-Efficient and Compute-Efficient Multimodal Large Language Model · 2024 · 1 cite

Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping · 2024

Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models · 2024