BFCLv-3

Emerging

14 papers using it

2025 first seen

BFCLv-3 is a benchmark dataset used to evaluate the effectiveness of large language model agents in generating and refining tool calls through simulated feedback.

Papers using BFCLv-3 (13)

D-CORE: Incentivizing Task Decomposition in Large Reasoning Models for Complex Tool Use (2026)

Pushing the Limits of LLM Tool Calling via Experiential Knowledge Integration and Activation (2026)

MAVEN: Improving Generalization in Agentic Tool Calling (2026)

TopoCurate: Modeling Interaction Topology for Tool-Use Agent Training (2026)

ToolWeave: Structured Synthesis of Complex Multi-Turn Tool-Calling Dialogues (2026)

HINT-SD: Targeted Hindsight Self-Distillation for Long-Horizon Agents (2026)

EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL (2026)

Controllable and Verifiable Tool-Use Data Synthesis for Agentic Reinforcement Learning (2026)

MagicAgent: Towards Generalized Agent Planning (2026)

Gecko: A Simulation Environment with Stateful Feedback for Refining Agent Tool Calls (2026)

On Generalization in Agentic Tool Calling: CoreThink Agentic Reasoner and MAVEN Dataset (2025)

Small Language Models For Agentic Systems: A Survey Of Architectures, Capabilities, And Deployment Trade Offs (2025)

ToolSample: Dual Dynamic Sampling Methods with Curriculum Learning for RL-based Tool Learning (2025)