Awesome Multi-Agent
Multi-Agent is one of the most active areas in Awesome AI Agents, with 4,019 papers in this collection, evaluated on datasets such as GAIA, ALFWorld, and SMAC. A strong starting point is "R-Zero: Self-Evolving Reasoning LLM from Zero Data".
Datasets & benchmarks
Key papers
- R-Zero: Self-Evolving Reasoning LLM from Zero Data (2025) · Chengsong Huang et al. · 19.83
- Agentic Reinforced Policy Optimization (2025) · Guanting Dong et al. · 19.61
- Autogen Studio: A No-code Developer Tool For Building And Debugging Multi-agent Systems (2024) · Victor Dibia, Jingya Chen, Gagan Bansal, et al. · 19.59
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL (2025) · Weizhen Li et al. · 18.23
- InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners (2025) · Yuhang Liu et al. · 16.27
- Mapcoder: Multi-agent Code Generation For Competitive Problem Solving (2024) · Md. Ashraful Islam, Mohammed Eunus Ali, Md Rizwan Parvez · 15.96
- MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent (2025) · Hongli Yu et al. · 15.38
- Dynamic Multi-robot Task Allocation Under Uncertainty And Temporal Constraints (2020) · Shushman Choudhury, Jayesh K. Gupta, Mykel J. Kochenderfer, et al. · 15.28
- MAPPER: Multi-agent Path Planning With Evolutionary Reinforcement Learning In Mixed Dynamic Environments (2020) · Zuxin Liu, Baiming Chen, Hongyi Zhou, et al. · 15.19
- A Multi-agent Reinforcement Learning Approach For Efficient Client Selection In Federated Learning (2022) · Sai Qian Zhang, Jieyu Lin, Qi Zhang · 15.00
- AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security (2026) · Dongrui Liu et al. · 14.78
- COVINS: Visual-inertial SLAM For Centralized Collaboration (2021) · Patrik Schmuck, Thomas Ziegler, Marco Karrer, et al. · 14.66
- VMAS: A Vectorized Multi-agent Simulator For Collective Robot Learning (2022) · Matteo Bettini, Ryan Kortvelesy, Jan Blumenkamp, et al. · 14.32
- Deep Research Agents: A Systematic Examination And Roadmap (2025) · Yuxuan Huang et al. · 14.02
- Kimi K2.5: Visual Agentic Intelligence (2026) · Kimi Team: Tongtong Bai et al. · 13.84
- Agent Explorative Policy Optimization for Multimodal Agentic Reasoning (2026) · Minki Kang et al. · 13.75
- Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution (2026) · Xucong Wang et al. · 13.67
- AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration (2026) · Jiaqi Liu et al. · 13.52
- Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks (2026) · Mengyu Zheng et al. · 13.33
- Learning Agent Communication Under Limited Bandwidth By Message Pruning (2019) · Hangyu Mao, Zhengchao Zhang, Zhen Xiao, et al. · 13.23
- Agentic Environment Engineering for Large Language Models: A Survey of Environment Modeling, Synthesis, Evaluation, and Application (2026) · Jiachun Li et al. · 13.22
- Heterogeneous Agent Collaborative Reinforcement Learning (2026) · Zhixia Zhang et al. · 13.18
- ACC: Compiling Agent Trajectories for Long-Context Training (2026) · Qisheng Su et al. · 13.06
- AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints (2026) · Jiayu Liu et al. · 12.83
- AWorld: Orchestrating the Training Recipe for Agentic AI (2025) · Chengyue Yu et al. · 12.78
- DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models (2025) · DeepSeek-AI et al. · 12.70
- ResearchMath-14K: Scaling Research-Level Mathematics via Agents (2026) · Guijin Son et al. · 12.37
- Streaming Communication in Multi-Agent Reasoning (2026) · Zhen Yang et al. · 12.32
- Orchestra-o1: Omnimodal Agent Orchestration (2026) · Fan Zhang et al. · 12.21
- DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation (2026) · Yibo Wang et al. · 12.15
- LectūraAgents: A Multi-Agent Framework for Adaptive Personalized AI-Assisted Learning and Embodied Teaching (2026) · Jaward Sesay et al. · 12.03
- HarnessX: A Composable, Adaptive, and Evolvable Agent Harness Foundry (2026) · Tingyang Chen et al. · 11.97
- Agent models: Internalizing Chain-of-Action Generation into Reasoning models (2025) · Yuxiang Zhang et al. · 11.87
- CHI-Bench: Can AI Agents Automate End-to-End, Long-Horizon, Policy-Rich Healthcare Workflows? (2026) · Haolin Chen et al. · 11.86
- Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution (2025) · Tianrui Qin et al. · 11.85
- MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents (2025) · Kunlun Zhu et al. · 11.84
- SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research (2026) · Pu Ning et al. · 11.78
- WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning (2026) · Zelai Xu et al. · 11.70
- Who Should Lead Decoding Now? Tracking Reliable Trajectories for Ensembling Masked Diffusion Language Models (2026) · Heecheol Yun et al. · 11.70
- Latent Collaboration in Multi-Agent Systems (2025) · Jiaru Zou et al. · 11.66
- It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs (2026) · Sangwoo Park et al. · 11.64
- Graph-R1: Towards Agentic GraphRAG Framework via End-to-end Reinforcement Learning (2025) · Haoran Luo et al. · 11.55
- EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery (2026) · Amy Xin et al. · 11.55
- Personal AI Agent for Camera Roll VQA (2026) · Thao Nguyen et al. · 11.45
- A Cordial Sync: Going Beyond Marginal Policies For Multi-agent Embodied Tasks (2020) · Unnat Jain, Luca Weihs, Eric Kolve, et al. · 11.39
- π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows (2026) · Haoran Zhang et al. · 11.23
- VisualClaw: A Real-Time, Personalized Agent for the Physical World (2026) · Haoqin Tu et al. · 11.22
- GIS Copilot: Towards An Autonomous GIS Agent For Spatial Analysis (2024) · Temitope Akinboyewa, Zhenlong Li, Huan Ning, et al. · 11.08
- MARS: Modular Agent with Reflective Search for Automated AI Research (2026) · Jiefeng Chen et al. · 10.85
- Dif-maml: Decentralized Multi-agent Meta-learning (2020) · Mert Kayaalp, Stefan Vlaski, Ali H. Sayed · 10.85
- Fast Decomposition Of Temporal Logic Specifications For Heterogeneous Teams (2020) · Kevin Leahy, Austin Jones, Cristian-Ioan Vasile · 10.85
- Qwen3-Coder-Next Technical Report (2026) · Ruisheng Cao et al. · 10.84
- GUI-CIDER: Mid-training GUI Agents via Causal Internalization and Density-aware Exemplar Reselection (2026) · Zheng Wu et al. · 10.77
- Mix-Quant: Quantized Prefilling, Precise Decoding for Agentic LLMs (2026) · Haiquan Lu et al. · 10.69
- Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning (2025) · Guanting Dong et al. · 10.67
- TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning (2026) · Heming Zou et al. · 10.60
- Prompt Injection Attacks in Large Language Models and AI Agent Systems: A Comprehensive Review of Vulnerabilities, Attack Vectors, and Defense Mechanisms (2026) · Saidakhror Gulyamov et al. · 10.60
- ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas (2026) · Xiaoyu Tian, Haotian Wang, Shuaiting Chen, Hao Zhou, Kaichi Yu, Yudian Zhang, Jade Ouyang, Junxi Yin, Jiong Chen, Baoyan Guo, Lei Zhang, Junjie Tao, Yuansheng Song, Ming Cui, Chengwei Liu · 10.59
- OPD-Evolver: Cultivating Holistic Agent Evolver via On-Policy Distillation (2026) · Guibin Zhang et al. · 10.59
- CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery (2026) · Ao Qu et al. · 10.54