Awesome Papers

Papers

R-Zero: Self-Evolving Reasoning LLM from Zero Data (2025)
Chengsong Huang et al.
19.83
Agentic Reinforced Policy Optimization (2025)
Guanting Dong et al.
19.61
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning (2025)
Huatong Song et al.
18.73
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning (2025)
Haozhan Li et al.
18.40
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL (2025)
Weizhen Li et al.
18.23
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners (2025)
Yuhang Liu et al.
16.27
SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning (2025)
Zhongwei Wan et al.
16.17
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning (2025)
Fanqi Wan et al.
16.01
EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments (2026)
Jundong Xu et al.
16.01
SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning (2026)
Seokju Cho et al.
15.70
MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent (2025)
Hongli Yu et al.
15.38
AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security (2026)
Dongrui Liu et al.
14.78
APPO: Agentic Procedural Policy Optimization (2026)
Xucong Wang et al.
14.60
FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents (2026)
Jia Deng et al.
14.20
Kimi K2.5: Visual Agentic Intelligence (2026)
Kimi Team: Tongtong Bai et al.
13.84
Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution (2026)
Xucong Wang et al.
13.67
LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories (2026)
Baochang Ren et al.
13.59
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration (2026)
Jiaqi Liu et al.
13.52
Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks (2026)
Mengyu Zheng et al.
13.33
ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time? (2026)
Woojung Song et al.
13.24
Agentic Environment Engineering for Large Language Models: A Survey of Environment Modeling, Synthesis, Evaluation, and Application (2026)
Jiachun Li et al.
13.22
Heterogeneous Agent Collaborative Reinforcement Learning (2026)
Zhixia Zhang et al.
13.18
MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research (2026)
Dingbang Wu et al.
12.91
AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints (2026)
Jiayu Liu et al.
12.83
AWorld: Orchestrating the Training Recipe for Agentic AI (2025)
Chengyue Yu et al.
12.78
SWE-Explore: Benchmarking How Coding Agents Explore Repositories (2026)
Shaoqiu Zhang et al.
12.73
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models (2025)
DeepSeek-AI et al.
12.70
COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation (2026)
Tianyi Zhou et al.
12.70
K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts (2026)
Nahyun Lee et al.
12.62
SkillOpt: Executive Strategy for Self-Evolving Agent Skills (2026)
Yifan Yang et al.
12.58
From Chatbot to Digital Colleague: The Paradigm Shift Toward Persistent Autonomous AI (2026)
Yongheng Zhang et al.
12.47
LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning (2026)
Yifan Dai et al.
12.42
ResearchMath-14K: Scaling Research-Level Mathematics via Agents (2026)
Guijin Son et al.
12.37
Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios (2024)
Shijue Huang, Wanjun Zhong, Jianqiao Lu, et al.
12.31
LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards (2026)
Nianyi Lin et al.
12.21
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation (2026)
Yibo Wang et al.
12.15
Self-Improving Language Models with Bidirectional Evolutionary Search (2026)
Guowei Xu et al.
12.10
GrepSeek: Training Search Agents for Direct Corpus Interaction (2026)
Alireza Salemi et al.
12.06
Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents (2026)
Suji Kim et al.
12.03
HarnessX: A Composable, Adaptive, and Evolvable Agent Harness Foundry (2026)
Tingyang Chen et al.
11.97
Agent models: Internalizing Chain-of-Action Generation into Reasoning models (2025)
Yuxiang Zhang et al.
11.87
Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution (2025)
Tianrui Qin et al.
11.85
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents (2025)
Kunlun Zhu et al.
11.84
QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks (2026)
Jian Xie et al.
11.71
World Models Meet Language Models: On the Complementarity of Concrete and Abstract Reasoning (2026)
Yucheng Zhou et al.
11.70
Latent Collaboration in Multi-Agent Systems (2025)
Jiaru Zou et al.
11.66
It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs (2026)
Sangwoo Park et al.
11.64
Relay Hindsight Experience Replay: Self-Guided Continual Reinforcement Learning for Sequential Object Manipulation Tasks with Sparse Rewards (2022)
Yongle Luo, Yuxin Wang, Kun Dong, et al.
11.58
Graph-R1: Towards Agentic GraphRAG Framework via End-to-end Reinforcement Learning (2025)
Haoran Luo et al.
11.55
EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery (2026)
Amy Xin et al.
11.55