Qwen-2-VL
Emerging · 3 papers using it
First seen: 2025
Qwen2-VL is listed here as a benchmark dataset for evaluating complex visual reasoning in multimodal models. It contains high-quality samples with multi-region annotations and structured reasoning chains.
Papers using Qwen-2-VL (3)
- Watch Wider and Think Deeper: Collaborative Cross-modal Chain-of-Thought for Complex Visual Reasoning
- Chain-of-Thought Compression Should Not Be Blind: V-Skip for Efficient Multimodal Reasoning via Dual-Path Anchoring
- Simulated Ensemble Attack: Transferring Jailbreaks Across Fine-tuned Vision-language Models