BFCL

Emerging

3 papers using it

First seen: 2026

BFCL is a benchmark dataset of 400 function-calling tasks. It is used to evaluate operational metrics such as determinism, reliability, security, and cost for LLM-based code generation in enterprise workflows.

Papers using BFCL (3)

ParaTool: Shifting Tool Representations from Context to Parameters (2026)

The Cold-Start Safety Gap in LLM Agents (2026)

Compiled AI: Deterministic Code Generation for LLM-Based Workflow Automation (2026)