Pengfei Liu

26 papers · 1163 citations

Most-cited papers

DeepResearcher: Scaling Deep Research Via Reinforcement Learning In Real-world Environments
2025 · 192 citations
Generative Judge For Evaluating Alignment
2023 · 165 citations
InFoBench: Evaluating Instruction Following Ability In Large Language Models
2024 · 109 citations
Benchmarking Benchmark Leakage In Large Language Models
2024 · 105 citations
Let's Reward Step By Step: Step-level Reward Model As The Navigators For Reasoning
2023 · 97 citations
ProjDevBench: Benchmarking AI Coding Agents On End-to-end Project Development
2026
UI-TARS-2 Technical Report: Advancing GUI Agent With Multi-turn Reinforcement Learning
2025
InnovatorBench: Evaluating Agents' Ability To Conduct Innovative LLM Research
2025

Topics

Evaluation Reinforcement Learning Prompting Training Techniques Code Agents RAG In-Context Learning Safety & Alignment Multi-Agent Memory