Higher Replay Ratio Empowers Sample-efficient Multi-agent Reinforcement Learning
2024 · Linjie Xu, Zichuan Liu, Alexander Dockhorn, et al.
Abstract
Poor sample efficiency is a notorious issue in Reinforcement Learning (RL). It is even more challenging in Multi-Agent Reinforcement Learning (MARL) than in single-agent RL because of MARL's inherent partial observability, non-stationary training, and enormous strategy space. Although much effort has been devoted to developing new methods that enhance sample efficiency, we instead examine the widely used episodic training mechanism. In each training step, tens of frames are collected, but only one gradient step is made. We argue that this episodic training could be a source of poor sample efficiency. To better exploit the data already collected, we propose increasing the number of gradient updates per environment interaction (a.k.a. Replay Ratio or Update-To-Data ratio). To show its generality, we evaluate 3 MARL methods on 6 SMAC tasks. The empirical results validate that a higher replay ratio significantly improves the sample efficiency for MARL a
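To make the idea concrete, here is a minimal sketch of an episodic training loop in which the number of gradient updates per collected episode is a parameter. It is not the authors' implementation: `ReplayBuffer`, `collect_episode`, `train`, and `updates_per_episode` are hypothetical names, the episode length of 30 frames is an assumption standing in for "tens of frames", and the gradient step is replaced by a counter so the example runs without an environment or a learner.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of whole episodes (hypothetical helper)."""

    def __init__(self, capacity):
        self.episodes = deque(maxlen=capacity)

    def add(self, episode):
        self.episodes.append(episode)

    def sample(self, batch_size, rng):
        # Sample without replacement, capped by how many episodes are stored.
        k = min(batch_size, len(self.episodes))
        return rng.sample(list(self.episodes), k)


def collect_episode(rng, length=30):
    # Stand-in for an environment rollout: each "frame" is a random number.
    return [rng.random() for _ in range(length)]


def train(num_episodes, updates_per_episode, batch_size=4, seed=0):
    """Return (environment frames collected, gradient steps taken)."""
    rng = random.Random(seed)
    buffer = ReplayBuffer(capacity=100)
    env_frames = 0
    grad_steps = 0
    for _ in range(num_episodes):
        episode = collect_episode(rng)
        buffer.add(episode)
        env_frames += len(episode)
        # Conventional episodic training uses one update here;
        # a higher replay ratio simply repeats the update on fresh samples.
        for _ in range(updates_per_episode):
            batch = buffer.sample(batch_size, rng)
            # A real learner would compute a loss on `batch` and step an optimizer.
            grad_steps += 1
    return env_frames, grad_steps


baseline = train(num_episodes=10, updates_per_episode=1)
higher = train(num_episodes=10, updates_per_episode=4)
```

Under these assumptions both runs consume the same 300 environment frames, but the second performs four times as many updates, so its ratio of gradient steps to environment frames is four times higher.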