Awesome Planning
Planning is one of the most active areas in Awesome AI Agents, with 2,458 papers in this collection, evaluated on datasets such as ALFWorld, WebShop, and LIBERO. A strong starting point is "The Regretful Agent: Heuristic-aided Navigation Through Progress Estimation".
Datasets & benchmarks
Key papers
- The Regretful Agent: Heuristic-aided Navigation Through Progress Estimation (2019) Chih-Yao Ma, Zuxuan Wu, Ghassan Alregib, et al. 19.27
- Mobile Robot Path Planning In Dynamic Environments Through Globally Guided Reinforcement Learning (2020) Binyu Wang, Zhe Liu, Qingbiao Li, et al. 18.85
- SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning (2025) Haozhan Li et al. 18.40
- AgentTuning: Enabling Generalized Agent Abilities For LLMs (2023) Aohan Zeng, Mingdao Liu, Rui Lu, et al. 16.41
- InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners (2025) Yuhang Liu et al. 16.27
- MAPPER: Multi-agent Path Planning With Evolutionary Reinforcement Learning In Mixed Dynamic Environments (2020) Zuxin Liu, Baiming Chen, Hongyi Zhou, et al. 15.19
- Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments (2026) Qiuyue Wang et al. 15.10
- APPO: Agentic Procedural Policy Optimization (2026) Xucong Wang et al. 14.60
- Deep Research Agents: A Systematic Examination And Roadmap (2025) Yuxuan Huang et al. 14.02
- LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories (2026) Baochang Ren et al. 13.59
- LLMind: Orchestrating AI And IoT With LLM For Complex Task Execution (2023) Hongwei Cui, Yuyang Du, Qun Yang, et al. 13.39
- AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints (2026) Jiayu Liu et al. 12.83
- From Chatbot to Digital Colleague: The Paradigm Shift Toward Persistent Autonomous AI (2026) Yongheng Zhang et al. 12.75
- DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models (2025) DeepSeek-AI et al. 12.70
- Understanding The Weakness Of Large Language Model Agents Within A Complex Android Environment (2024) Mingzhe Xing, Rongkai Zhang, Hui Xue, et al. 12.57
- SayNav: Grounding Large Language Models For Dynamic Planning To Navigation In New Environments (2023) Abhinav Rajvanshi, Karan Sikka, Xiao Lin, et al. 12.54
- Planning, Creation, Usage: Benchmarking LLMs For Comprehensive Tool Utilization In Real-world Complex Scenarios (2024) Shijue Huang, Wanjun Zhong, Jianqiao Lu, et al. 12.31
- Decentralized Cooperative Planning For Automated Vehicles With Hierarchical Monte Carlo Tree Search (2018) Karl Kurzer, Chenyang Zhou, J. Marius Zöllner 12.25
- DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation (2026) Yibo Wang et al. 12.15
- Self-Improving Language Models with Bidirectional Evolutionary Search (2026) Guowei Xu et al. 12.10
- GrepSeek: Training Search Agents for Direct Corpus Interaction (2026) Alireza Salemi et al. 12.06
- Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents (2026) Suji Kim et al. 12.03
- Agent models: Internalizing Chain-of-Action Generation into Reasoning models (2025) Yuxiang Zhang et al. 11.87
- MapGPT: Map-guided Prompting With Adaptive Path Planning For Vision-and-language Navigation (2024) Jiaqi Chen, Bingqian Lin, Ran Xu, et al. 11.85
- MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents (2025) Kunlun Zhu et al. 11.84
- SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research (2026) Pu Ning et al. 11.78
- Geometric Action Model for Robot Policy Learning (2026) Jisang Han et al. 11.74
- World Models Meet Language Models: On the Complementarity of Concrete and Abstract Reasoning (2026) Yucheng Zhou et al. 11.70
- Relay Hindsight Experience Replay: Self-guided Continual Reinforcement Learning For Sequential Object Manipulation Tasks With Sparse Rewards (2022) Yongle Luo, Yuxin Wang, Kun Dong, et al. 11.58
- Graph-R1: Towards Agentic GraphRAG Framework via End-to-end Reinforcement Learning (2025) Haoran Luo et al. 11.55
- EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery (2026) Amy Xin et al. 11.55
- Active Inference And Behavior Trees For Reactive Action Planning And Execution In Robotics (2020) Corrado Pezzato, Carlos Hernandez Corbato, Stefan Bonhof, et al. 11.49
- Hy-Embodied-0.5-VLA: From Vision-Language-Action Models to a Real-World Robot Learning Stack (2026) He Zhang et al. 11.23
- SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning (2026) Peng Xia et al. 11.12
- Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields (2026) Liya Zhu et al. 10.93
- Fast Decomposition Of Temporal Logic Specifications For Heterogeneous Teams (2020) Kevin Leahy, Austin Jones, Cristian-Ioan Vasile 10.85
- GUI-CIDER: Mid-training GUI Agents via Causal Internalization and Density-aware Exemplar Reselection (2026) Zheng Wu et al. 10.77
- Rethinking Continual Experience Internalization for Self-Evolving LLM Agents (2026) Jingwen Chen et al. 10.72
- Mix-Quant: Quantized Prefilling, Precise Decoding for Agentic LLMs (2026) Haiquan Lu et al. 10.69
- The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence (2026) MiniMax et al. 10.69
- Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning (2025) Guanting Dong et al. 10.67
- PhotoFlow: Agentic 3D Virtual Photography Missions (2026) Jiarui Guo et al. 10.66
- TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning (2026) Heming Zou et al. 10.60
- ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas (2026) Xiaoyu Tian, Haotian Wang, Shuaiting Chen, Hao Zhou, Kaichi Yu, Yudian Zhang, Jade Ouyang, Junxi Yin, Jiong Chen, Baoyan Guo, Lei Zhang, Junjie Tao, Yuansheng Song, Ming Cui, Chengwei Liu 10.59
- DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation (2026) Jusuk Lee et al. 10.56
- NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation (2026) NVIDIA et al. 10.48
- MapAgent: An Industrial-Grade Agentic Framework for City-scale Lane-level Map Generation (2026) Deguo Xia et al. 10.48
- Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning (2026) NVIDIA (Allan) et al. 10.44
- Skill0.5: Joint Skill Internalization and Utilization for Out-of-Distribution Generalization in Agentic Reinforcement Learning (2026) Jiapeng Zhu et al. 10.42
- Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning (2026) Chi-Pin Huang et al. 10.38
- Goal Alignment in LLM-Based User Simulators for Conversational AI (2025) Shuhaib Mehri et al. 10.35
- LectūraAgents: A Multi-Agent Framework for Adaptive Personalized AI-Assisted Learning and Embodied Teaching (2026) Jaward Sesay et al. 10.35
- ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking (2026) Qiang Zhang et al. 10.34
- MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers (2025) Zhenting Wang et al. 10.30
- User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale (2026) Jungho Cho et al. 10.30
- AHA-WAM: Asynchronous Horizon-Adaptive World-Action Modeling with Observation-Guided Context Routing (2026) Jisong Cai et al. 10.21
- LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis (2026) Kewei Xu et al. 10.15
- EVA: Efficient Reinforcement Learning for End-to-End Video Agent (2026) Yao Zhang et al. 10.02
- DoReMi: Grounding Language Model By Detecting And Recovering From Plan-execution Misalignment (2023) Yanjiang Guo, Yen-Jen Wang, Lihan Zha, et al. 9.92
- Improving LaCAM For Scalable Eventually Optimal Multi-agent Pathfinding (2023) Keisuke Okumura 9.92