Dispelling The Mirage Of Progress In Offline MARL Through Standardised Baselines And Evaluation
2024 · Claude Formanek, Callum Rhys Tilbury, Louise Beyers, et al.
Abstract
Offline multi-agent reinforcement learning (MARL) is an emerging field with great promise for real-world applications. Unfortunately, the current state of research in offline MARL is plagued by inconsistencies in baselines and evaluation protocols, which ultimately makes it difficult to accurately assess progress, trust newly proposed innovations, and allow researchers to easily build upon prior work. In this paper, we firstly identify significant shortcomings in existing methodologies for measuring the performance of novel algorithms through a representative study of published offline MARL work. Secondly, by directly comparing to this prior work, we demonstrate that simple, well-implemented baselines can achieve state-of-the-art (SOTA) results across a wide range of tasks. Specifically, we show that on 35 out of 47 datasets used in prior work (almost 75% of cases), we match or surpass the performance of the current purported SOTA. Strikingly, our baselines often substantially outperfo
Related papers
- Off-the-Grid MARL: Datasets With Baselines For Offline Multi-agent Reinforcement Learning (2023)
- Towards A Standardised Performance Evaluation Protocol For Cooperative MARL (2022)
- How Much Can Change In A Year? Revisiting Evaluation In Multi-agent Reinforcement Learning (2023)
- BenchMARL: Benchmarking Multi-agent Reinforcement Learning (2023)
- Benchmarking Multi-agent Deep Reinforcement Learning Algorithms In Cooperative Tasks (2020)
- Hokoff: Real Game Dataset From Honor Of Kings And Its Offline Reinforcement Learning Benchmarks (2024)
- Optimality Inductive Biases And Agnostic Guidelines For Offline Reinforcement Learning (2021)
- Learning From Good Trajectories In Offline Multi-agent Reinforcement Learning (2022)