EIFBENCH
Emerging · 4 papers using it
First seen: 2025
EIFBENCH is a benchmark designed to evaluate large language models on their ability to follow extremely complex instructions in multi-task scenarios with various constraints, reflecting real-world operational environments.
Papers using EIFBENCH (4)
- FAPO: Fully Autonomous Prompt Optimization of Multi-Step LLM Pipelines
- EIBench: A Simulator-Based Benchmark and Turn-Credit RL for Emotion Management
- Ask, Don't Judge: Binary Questions for Interpretable LLM Evaluation and Self-Improvement
- EIFBENCH: Extremely Complex Instruction Following Benchmark for Large Language Models