Qwen-2.5-VL

Emerging

9 papers using it

First seen: 2025

Qwen-2.5-VL is a vision-language model rather than a dataset or benchmark in the usual sense. Papers indexed here use it as an evaluation subject, for example to test hallucination detection based on the model's internal representations before any text is generated.

Papers using Qwen-2.5-VL (9)

Curvature-Guided Mixing for MLLM Adaptation (2026)

HAWK: Head Importance-Aware Visual Token Pruning in Multimodal Models (2026)

Think-as-You-See: Streaming Chain-of-Thought Reasoning for Large Vision-Language Models (2026)

HALP: Detecting Hallucinations in Vision-Language Models without Generating a Single Token (2026)

Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity (2026)

MedVLThinker: Simple Baselines for Multimodal Medical Reasoning (2025)

MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models (2025)

Mitigating Hallucinations via Inter-Layer Consistency Aggregation in Large Vision-Language Models (2025)

Nexus: An Omni-Perceptive and -Interactive Model for Language, Audio, and Vision (2025)