SkillBench

Name: SkillBench
License: apache-2.0

Emerging

3 papers using it

17 HF downloads

0 HF likes

2026 first seen

SkillBench is a challenging benchmark designed to evaluate an LLM's logical orchestration and cross-domain skill synthesis capabilities. Developed using the STEPS framework and synthesized via GPT-4.1, it moves beyond simple tool-calling to test how models solve complex, multi-step problems by integrating diverse vertical skills.

Dataset Scale & Statistics: The dataset contains 545 high-quality, expert-validated samples. These are grounded in diverse seeds from Infinity-Instruct and… See the full description on the dataset page: https://huggingface.co/datasets/Weiyifan/SkillBench.
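Since the dataset is hosted on Hugging Face, it can presumably be pulled with the `datasets` library. The following is a minimal sketch, not an official loader: it assumes `datasets` is installed, that the repository loads with its default configuration, and that nothing is known in advance about split or column names (the card excerpt above does not state them). The helper names `repo_id_from_url` and `load_skillbench` are hypothetical and introduced only for illustration.

```python
from urllib.parse import urlparse

# Dataset page URL as given in the description above.
DATASET_URL = "https://huggingface.co/datasets/Weiyifan/SkillBench"


def repo_id_from_url(url: str) -> str:
    """Extract the '<owner>/<name>' repo id from a Hugging Face dataset URL."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    return "/".join(parts[1:])


def load_skillbench():
    """Download the dataset (needs network access and the `datasets` package).

    Assumption: the default configuration loads without extra arguments.
    Called without `split`, load_dataset returns a DatasetDict keyed by split name.
    """
    from datasets import load_dataset  # imported lazily so the helper above works offline

    return load_dataset(repo_id_from_url(DATASET_URL))
```

To inspect what was actually downloaded, one could call `splits = load_skillbench()` and then loop over `splits.items()`, printing each split's `num_rows` and `column_names` rather than assuming a particular schema.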


Papers using SkillBench (1)

SkVM: Compiling Skills for Efficient Execution Everywhere (2026)