Qwen-3 8B

Emerging
3 papers using it
2026 first seen

'Qwen3-8B' is a benchmark dataset for evaluating post-trained, tool-using large language models (LLMs). It provides diverse, executable, and verifiable agentic tasks derived from real-world tool usage.

Papers using Qwen-3 8B (3)
