Qwen3-8B
Emerging
3 papers using it
First seen: 2026
'Qwen3-8B' is an 8-billion-parameter large language model (LLM) in the Qwen3 family, not a benchmark dataset. It is listed here for its use in work on post-training tool-using LLMs with diverse, executable, and verifiable agentic tasks derived from real-world tool usage.