Pengfei Liu
26 papers · 1163 citations
Most-cited papers
- Deepresearcher: Scaling Deep Research Via Reinforcement Learning In Real-world Environments · 2025 · 192 citations
- Generative Judge For Evaluating Alignment · 2023 · 165 citations
- Infobench: Evaluating Instruction Following Ability In Large Language Models · 2024 · 109 citations
- Benchmarking Benchmark Leakage In Large Language Models · 2024 · 105 citations
- Let's Reward Step By Step: Step-level Reward Model As The Navigators For Reasoning · 2023 · 97 citations
- Projdevbench: Benchmarking AI Coding Agents On End-to-end Project Development · 2026
- UI-TARS-2 Technical Report: Advancing GUI Agent With Multi-turn Reinforcement Learning · 2025
- Innovatorbench: Evaluating Agents' Ability To Conduct Innovative LLM Research · 2025
Topics