Awesome AI Agents

Planning

© 2026 Awesome Papers.

Awesome Planning — curated papers, datasets & benchmarks · Awesome AI Agents

Awesome Planning

Planning is one of the most active areas in Awesome AI Agents: this collection holds 2,087 papers, evaluated on datasets such as ALFWorld, WebShop, and GAIA. A strong starting point is "The Regretful Agent: Heuristic-aided Navigation Through Progress Estimation".

Datasets & benchmarks

ALFWorld: 48 papers

WebShop: 20 papers

ScienceWorld: 13 papers

TravelPlanner: 9 papers

BrowseComp: 8 papers

Minecraft: 8 papers

WebArena: 7 papers

VirtualHome: 7 papers

OSWorld: 6 papers

HotpotQA: 6 papers

Key papers

60 papers · trending (default) · numbers = 🔥 heat

The Regretful Agent: Heuristic-aided Navigation Through Progress Estimation (2019)
Chih-Yao Ma, Zuxuan Wu, Ghassan AlRegib, et al.
19.27
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning (2025)
Haozhan Li et al.
18.29
ABot-World-0: Infinite Interactive World Rollout on a Single Desktop GPU (2026)
Fan Jiang et al.
17.67
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners (2025)
Yuhang Liu et al.
16.16
Harness Handbook: Making Evolving Agent Harnesses Readable, Navigable, and Editable (2026)
Ruhan Wang et al.
15.69
MAPPER: Multi-agent Path Planning With Evolutionary Reinforcement Learning In Mixed Dynamic Environments (2020)
Zuxin Liu, Baiming Chen, Hongyi Zhou, et al.
15.19
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments (2026)
Qiuyue Wang et al.
14.99
Qwen-AgentWorld: Language World Models for General Agents (2026)
Yuxin Zuo et al.
14.85
APPO: Agentic Procedural Policy Optimization (2026)
Xucong Wang et al.
14.49
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models (2025)
DeepSeek-AI et al.
14.19
Deep Research Agents: A Systematic Examination And Roadmap (2025)
Yuxuan Huang et al.
13.91
ABot-N1: Toward a General Visual Language Navigation Foundation Model (2026)
Ruiyan Gong et al.
13.91
LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories (2026)
Baochang Ren et al.
13.47
LLMind: Orchestrating AI And IoT With LLM For Complex Task Execution (2023)
Hongwei Cui, Yuyang Du, Qun Yang, et al.
13.39
AgenticSTS: A Bounded-Memory Testbed for Long-Horizon LLM Agents (2026)
Xiangchen Cheng et al.
12.98
Progress Reward Modeling for Robotic Learning: A Comprehensive Survey (2026)
Jianshu Zhang et al.
12.88
AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints (2026)
Jiayu Liu et al.
12.72
Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent (2026)
Lei Bai et al.
12.68
From Chatbot to Digital Colleague: The Paradigm Shift Toward Persistent Autonomous AI (2026)
Yongheng Zhang et al.
12.64
Understanding The Weakness Of Large Language Model Agents Within A Complex Android Environment (2024)
Mingzhe Xing, Rongkai Zhang, Hui Xue, et al.
12.57
SayNav: Grounding Large Language Models For Dynamic Planning To Navigation In New Environments (2023)
Abhinav Rajvanshi, Karan Sikka, Xiao Lin, et al.
12.54
Planning, Creation, Usage: Benchmarking LLMs For Comprehensive Tool Utilization In Real-world Complex Scenarios (2024)
Shijue Huang, Wanjun Zhong, Jianqiao Lu, et al.
12.31
Decentralized Cooperative Planning For Automated Vehicles With Hierarchical Monte Carlo Tree Search (2018)
Karl Kurzer, Chenyang Zhou, J. Marius Zöllner
12.25
OpenCoF: Learning to Reason Through Video Generation (2026)
Xinyan Chen et al.
12.17
KnowAct-GUIClaw: Know Deeply, Act Perfectly, Personal GUI Assistant with Self-Evolving Memory and Skill (2026)
Yunxin Li et al.
12.06
Self-Improving Language Models with Bidirectional Evolutionary Search (2026)
Guowei Xu et al.
11.99
GrepSeek: Training Search Agents for Direct Corpus Interaction (2026)
Alireza Salemi et al.
11.95
Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents (2026)
Suji Kim et al.
11.92
VideoSearch-R1: Iterative Video Retrieval and Reasoning via Soft Query Refinement (2026)
Seohyun Lee et al.
11.77
Agent models: Internalizing Chain-of-Action Generation into Reasoning models (2025)
Yuxiang Zhang et al.
11.76
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents (2025)
Kunlun Zhu et al.
11.73
ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking (2026)
Qiang Zhang et al.
11.73
SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research (2026)
Pu Ning et al.
11.67
Playful Agentic Robot Learning (2026)
Junyi Zhang et al.
11.62
Relay Hindsight Experience Replay: Self-guided Continual Reinforcement Learning For Sequential Object Manipulation Tasks With Sparse Rewards (2022)
Yongle Luo, Yuxin Wang, Kun Dong, et al.
11.58
Active Inference And Behavior Trees For Reactive Action Planning And Execution In Robotics (2020)
Corrado Pezzato, Carlos Hernandez Corbato, Stefan Bonhof, et al.
11.49
EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery (2026)
Amy Xin et al.
11.44
Guava: An Effective and Universal Harness for Embodied Manipulation (2026)
Haowen Liu et al.
11.44
SkillRise: Agentic Reinforcement Learning for Cross-Task Skill Evolution (2026)
Zhiyuan Yao et al.
11.33
Long-Horizon-Terminal-Bench: Testing the Limits of Agents on Long-Horizon Terminal Tasks with Dense Reward-Based Grading (2026)
Zongxia Li et al.
10.89
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers (2025)
Zhenting Wang et al.
10.85
Fast Decomposition Of Temporal Logic Specifications For Heterogeneous Teams (2020)
Kevin Leahy, Austin Jones, Cristian-Ioan Vasile
10.85
Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields (2026)
Liya Zhu et al.
10.82
Claw-Anything: Benchmarking Always-On Personal Assistants with Broader Access to User's Digital World (2026)
Yusong Lin et al.
10.76
Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents (2026)
Ziyan Liu et al.
10.68
GUI-CIDER: Mid-training GUI Agents via Causal Internalization and Density-aware Exemplar Reselection (2026)
Zheng Wu, Chengcheng Han, Zhengxi Lu, Tianjie Ju, Yanyu Chen, Qi Gu, Xunliang Cai, Zhuosheng Zhang
10.66
ASPIRE: Agentic Skills Discovery for Robotics (2026)
Runyu Lu et al.
10.66
Rethinking Continual Experience Internalization for Self-Evolving LLM Agents (2026)
Jingwen Chen et al.
10.60
MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery (2026)
Shangheng Du et al.
10.60
Mix-Quant: Quantized Prefilling, Precise Decoding for Agentic LLMs (2026)
Haiquan Lu et al.
10.58
The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence (2026)
Aili Chen et al.
10.58
PhotoFlow: Agentic 3D Virtual Photography Missions (2026)
Jiarui Guo et al.
10.55
SkillCoach: Self-Evolving Rubrics for Evaluating and Enhancing Agentic Skill-Use (2026)
Jiayin Zhu et al.
10.54
TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning (2026)
Heming Zou et al.
10.49
ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas (2026)
Xiaoyu Tian, Haotian Wang, Shuaiting Chen, Hao Zhou, Kaichi Yu, Yudian Zhang, Jade Ouyang, Junxi Yin, Jiong Chen, Baoyan Guo, Lei Zhang, Junjie Tao, Yuansheng Song, Ming Cui, Chengwei Liu
10.48
DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation (2026)
Jusuk Lee et al.
10.45
ShadowDancer: Teaching Video World Models Any Action by Learning Unified Dynamics Representations from a Video and Its Shadow (2026)
Jin Cao et al.
10.42
MapAgent: An Industrial-Grade Agentic Framework for City-scale Lane-level Map Generation (2026)
Deguo Xia et al.
10.37
Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning (2026)
NVIDIA (Allan) et al.
10.33
Skill0.5: Joint Skill Internalization and Utilization for Out-of-Distribution Generalization in Agentic Reinforcement Learning (2026)
Jiapeng Zhu et al.
10.31