Awesome Papers
Yuxuan Zhu
4 papers · 0 citations
Most-cited papers
Terminal-bench: Benchmarking Agents On Hard, Realistic Tasks In Command Line Interfaces · 2026
Establishing Best Practices For Building Rigorous Agentic Benchmarks · 2025
Top co-authors
Yue Liu · 2
Bowen Wang · 1
Chao Feng · 1
Cheng Chen · 1
Cheng Liu · 1
Chuan Wen · 1
Flood Sung · 1
Hao Hu · 1
Haowei Lin · 1
Hao Zhang · 1
Ion Stoica · 1
Jiacheng Zhu · 1
Topics
Benchmarks
Code Agents