BFCL
Emerging
3 papers using it
First seen: 2026
BFCL is a benchmark of 400 function-calling tasks. It is used to evaluate operational metrics (determinism, reliability, security, and cost) of LLM-based code generation for enterprise workflows.