← all datasets

ToolBench

Emerging
3 papers using it
2,806 HF downloads
1 HF like
First seen in 2024

ToolBench is a benchmark that evaluates how well large language models use external tools. It provides a stable, large-scale framework that incorporates a virtual API server and a systematic evaluation approach.

Papers using ToolBench (3)
