Dealing With Non-stationarity In Decentralized Cooperative Multi-agent Deep Reinforcement Learning Via Multi-timescale Learning
2023 · Hadi Nekoei, Akilesh Badrinaaraayanan, Amit Sinha, et al.
Abstract
Decentralized cooperative multi-agent deep reinforcement learning (MARL) can be a versatile learning framework, particularly in scenarios where centralized training is either not possible or not practical. One of the critical challenges in decentralized deep MARL is the non-stationarity of the learning environment when multiple agents are learning concurrently. A commonly used and efficient scheme for decentralized MARL is independent learning, in which agents concurrently update their policies independently of each other. We first show that independent learning does not always converge, while sequential learning, where agents update their policies one after another in a sequence, is guaranteed to converge to an agent-by-agent optimal solution. In sequential learning, when one agent updates its policy, all other agents' policies are kept fixed, alleviating the challenge of non-stationarity due to simultaneous updates in other agents' policies. However, it can be slow because only one agen
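The notion of an agent-by-agent optimal solution can be illustrated with a toy example. The sketch below is not the paper's deep RL method: it assumes a two-agent cooperative matrix game with a shared payoff table, and the names `sequential_best_response` and `is_agent_by_agent_optimal` are hypothetical helpers introduced only for illustration. Agents update one at a time while the other's action is held fixed, and an agent switches only when the shared payoff strictly improves.

```python
def sequential_best_response(payoff, a, b, max_rounds=100):
    """Round-robin updates in a shared-payoff matrix game (illustrative toy).

    payoff[i][j] is the common reward when agent 1 plays i and agent 2 plays j.
    """
    n_a, n_b = len(payoff), len(payoff[0])
    for _ in range(max_rounds):
        prev = (a, b)
        # Agent 1 updates while agent 2's action b is held fixed.
        best_a = max(range(n_a), key=lambda i: payoff[i][b])
        if payoff[best_a][b] > payoff[a][b]:
            a = best_a
        # Agent 2 updates while agent 1's (new) action a is held fixed.
        best_b = max(range(n_b), key=lambda j: payoff[a][j])
        if payoff[a][best_b] > payoff[a][b]:
            b = best_b
        if (a, b) == prev:  # no agent changed its action: stop
            break
    return a, b


def is_agent_by_agent_optimal(payoff, a, b):
    """True if neither agent can raise the shared payoff by deviating alone."""
    v = payoff[a][b]
    no_gain_1 = all(payoff[i][b] <= v for i in range(len(payoff)))
    no_gain_2 = all(payoff[a][j] <= v for j in range(len(payoff[0])))
    return no_gain_1 and no_gain_2


# Assumed example payoffs: three coordination points of different quality.
payoff = [
    [5.0, 0.0, 0.0],
    [0.0, 8.0, 0.0],
    [0.0, 0.0, 10.0],
]

a, b = sequential_best_response(payoff, 0, 0)
print((a, b), payoff[a][b], is_agent_by_agent_optimal(payoff, a, b))
```

Starting from the joint action (0, 0), the procedure stays there: the point is agent-by-agent optimal even though the joint action (2, 2) pays more, which shows that agent-by-agent optimality is weaker than global optimality. Because every accepted update strictly increases the shared payoff and the game is finite, the loop in this toy setting must terminate.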
Related papers
- Non-stationary Policy Learning For Multi-timescale Multi-agent Reinforcement Learning (2023)
- Hierarchical Deep Multiagent Reinforcement Learning With Temporal Abstraction (2018)
- Unsynchronized Decentralized Q-learning: Two Timescale Analysis By Persistence (2023)
- Dealing With Non-stationarity In MARL Via Trust-region Decomposition (2021)
- Multi-agent Reinforcement Learning In Stochastic Networked Systems (2020)
- Transferable Multi-agent Reinforcement Learning With Dynamic Participating Agents (2022)
- MA2QL: A Minimalist Approach To Fully Decentralized Multi-agent Reinforcement Learning (2022)
- Mean-field Multi-agent Reinforcement Learning: A Decentralized Network Approach (2021)