Ο-Bench
Emerging
9 papers using it
First seen: 2025
Papers using Ο-Bench (9)
- Goal Alignment in LLM-Based User Simulators for Conversational AI
- Scaling Agentic Capabilities via Grounded Interaction Synthesis
- Uncertainty-Aware Clarification in LLM Agents with Information Gain
- When Agents Look the Same: Quantifying Distillation-Induced Similarity in Tool-Use Behaviors
- ReThinker: Scientific Reasoning by Rethinking with Guided Reflection and Confidence Control
- Protean Compiler: An Agile Framework to Drive Fine-grain Phase Ordering
- ScaleEnv: Scaling Environment Synthesis from Scratch for Generalist Interactive Tool-Use Agent Training
- Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization
- Impatient Users Confuse AI Agents: High-fidelity Simulations of Human Traits for Testing Agents