BFCLv-3
Emerging · 7 papers using it
First seen: 2025
BFCLv-3 is a benchmark for evaluating tool-use agents. It provides a structured set of tasks that capture interaction dynamics and error recovery.
Papers using BFCLv-3 (7)
- TIER: Trajectory-Invariant Execution Rewards for Multi-Step Tool Composition
- HINT-SD: Targeted Hindsight Self-Distillation for Long-Horizon Agents
- Controllable and Verifiable Tool-Use Data Synthesis for Agentic Reinforcement Learning
- TopoCurate: Modeling Interaction Topology for Tool-Use Agent Training
- MagicAgent: Towards Generalized Agent Planning
- ToolSample: Dual Dynamic Sampling Methods with Curriculum Learning for RL-based Tool Learning
- ShiQ: Bringing back Bellman to LLMs