Qwen-2.5-VL-7B
Emerging; 5 papers using it
First seen: 2025
Qwen2.5-VL-7B is a 7-billion-parameter vision-language model from the Qwen2.5-VL family. Papers use it as an evaluation model when studying the efficiency of multimodal large language models (MLLMs), particularly visual token selection and pruning strategies.
Papers using Qwen-2.5-VL-7B (5)
- Stepwise Token Selection for Efficient Multimodal Large Language Models
- MuCRASP: Multimodal Chain-of-thought Reasoning aware Structured Pruning
- Self-Correction Inside the Model: Leveraging Layer Attention to Mitigate Hallucinations in Large Vision Language Models
- HALP: Detecting Hallucinations in Vision-Language Models without Generating a Single Token
- Token-Level Inference-Time Alignment for Vision-Language Models