Lin Qiu

6 papers · 4 citations

Most-cited papers

Automatically Benchmarking LLM Code Agents Through Agent-driven Annotation And Evaluation
2025
CATArena: Evaluating Evolutionary Capabilities Of Code Agents Via Iterative Tournaments
2025
AMemGym: Interactive Memory Benchmarking For Assistants In Long-horizon Conversations
2026

Topics

Evaluation · Code Agents · Benchmarks · Multi-Agent · Memory