Qwen-2.5-VL-7B

Emerging

6 papers using it

2025 first seen

Qwen-2.5-VL-7B is a 7-billion-parameter Vision-Language Model (VLM), not a benchmark. It is indexed here because the papers below use it as a model under evaluation, for example when studying alignment and hallucination reduction during inference.

Papers using Qwen-2.5-VL-7B (6)

Stepwise Token Selection for Efficient Multimodal Large Language Models (2026)

GRIP: Feedback-Guided Prompt Retrieval for Large Multimodal Models (2026)

MuCRASP: Multimodal Chain-of-thought Reasoning aware Structured Pruning (2026)

Self-Correction Inside the Model: Leveraging Layer Attention to Mitigate Hallucinations in Large Vision Language Models (2026)

HALP: Detecting Hallucinations in Vision-Language Models without Generating a Single Token (2026)

Token-Level Inference-Time Alignment for Vision-Language Models (2025)