SMACv2: An Improved Benchmark For Cooperative Multi-agent Reinforcement Learning
2022 · Benjamin Ellis, Jonathan Cook, Skander Moalla, et al.
Abstract
The availability of challenging benchmarks has played a key role in the recent progress of machine learning. In cooperative multi-agent reinforcement learning, the StarCraft Multi-Agent Challenge (SMAC) has become a popular testbed for centralised training with decentralised execution. However, after years of sustained improvement on SMAC, algorithms now achieve near-perfect performance. In this work, we conduct new analysis demonstrating that SMAC lacks the stochasticity and partial observability to require complex *closed-loop* policies. In particular, we show that an *open-loop* policy conditioned only on the timestep can achieve non-trivial win rates for many SMAC scenarios. To address this limitation, we introduce SMACv2, a new version of the benchmark where scenarios are procedurally generated and require agents to generalise to previously unseen settings (from the same distribution) during evaluation. We also introduce the extended partial observability challenge (EPO), which au
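The distinction between open-loop and closed-loop policies can be illustrated with a toy sketch. This is not SMAC, SMACv2, or the authors' code: the one-dimensional environment, the constants `HORIZON` and `DISTANCE`, and every function name below are hypothetical and chosen only for illustration. The open-loop policy sees only the timestep, while the closed-loop policy reacts to its observation.

```python
import random

HORIZON = 5   # steps per episode (hypothetical value)
DISTANCE = 3  # distance from the start to the goal (hypothetical value)


def run_episode(policy, goal):
    """Play one episode; return True if the agent reaches the goal in time."""
    pos = 0
    for t in range(HORIZON):
        obs = 1 if goal > pos else -1  # observation: direction of the goal
        pos += policy(t, obs)          # action: a step of -1 or +1
        if pos == goal:
            return True
    return False


def open_loop(t, obs):
    """Open-loop: ignores the observation, acts on the timestep alone."""
    fixed_plan = [1, 1, 1, 1, 1]
    return fixed_plan[t]


def closed_loop(t, obs):
    """Closed-loop: ignores the timestep, acts on the observation."""
    return obs


def win_rate(policy, randomise, episodes=200, seed=0):
    """Fraction of episodes won; `randomise` samples the goal side per episode."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(episodes):
        goal = rng.choice([-DISTANCE, DISTANCE]) if randomise else DISTANCE
        wins += run_episode(policy, goal)
    return wins / episodes


if __name__ == "__main__":
    for name, policy in [("open-loop", open_loop), ("closed-loop", closed_loop)]:
        print(name,
              "fixed:", win_rate(policy, randomise=False),
              "randomised:", win_rate(policy, randomise=True))
```

In this toy setting the fixed scenario cannot tell the two policies apart, because a memorised action sequence is enough to win. Once the scenario is sampled per episode, only the policy that uses its observation keeps winning every time, which is the kind of gap that randomised scenario generation is meant to expose.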
Related papers
- SMAC-Hard: Enabling Mixed Opponent Strategy Script And Self-play On SMAC (2024)
- StarCraft+: Benchmarking Multi-agent Algorithms In Adversary Paradigm (2025)
- Transformer-based Value Function Decomposition For Cooperative Multi-agent Reinforcement Learning In StarCraft (2022)
- Decomposed Soft Actor-critic Method For Cooperative Multi-agent Reinforcement Learning (2021)
- Is Independent Learning All You Need In The StarCraft Multi-agent Challenge? (2020)
- Attention-based Recurrence For Multi-agent Reinforcement Learning Under Stochastic Partial Observability (2023)
- Population-based Evaluation In Repeated Rock-paper-scissors As A Benchmark For Multiagent Reinforcement Learning (2023)
- Semi-on-policy Training For Sample Efficient Multi-agent Policy Gradients (2021)