CTDS: Centralized Teacher With Decentralized Student For Multi-agent Reinforcement Learning
2022 · Jian Zhao, Xunhan Hu, Mingyu Yang, et al.
Abstract
Due to the partial observability and communication constraints in many multi-agent reinforcement learning (MARL) tasks, centralized training with decentralized execution (CTDE) has become one of the most widely used MARL paradigms. In CTDE, centralized information is dedicated to learning the allocation of the team reward with a mixing network, while the learning of individual Q-values is usually based on local observations. This insufficient use of global observations can degrade performance in challenging environments. To this end, this work proposes a novel Centralized Teacher with Decentralized Student (CTDS) framework, which consists of a teacher model and a student model. Specifically, the teacher model allocates the team reward by learning individual Q-values conditioned on global observation, while the student model uses the partial observations to approximate the Q-values estimated by the teacher model. In this way, CTDS balances the full utilization of global observati
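The teacher-student relationship described in the abstract can be illustrated with a minimal sketch. It assumes the student is fit to the teacher with a mean squared error over per-agent, per-action Q-values; the abstract only says the student "approximates" the teacher, so the choice of loss, the function names `distillation_loss` and `greedy_actions`, and the toy numbers are all hypothetical and not taken from the paper.

```python
from typing import List

# q[agent][action]: one row of Q-values per agent (hypothetical layout).
QTable = List[List[float]]


def distillation_loss(teacher_q: QTable, student_q: QTable) -> float:
    """Mean squared error between teacher and student Q-values.

    Assumption: MSE stands in for whatever objective the paper uses.
    The teacher's values would come from global observation, the
    student's from each agent's partial observation.
    """
    if len(teacher_q) != len(student_q):
        raise ValueError("teacher and student must cover the same agents")
    total, count = 0.0, 0
    for t_row, s_row in zip(teacher_q, student_q):
        if len(t_row) != len(s_row):
            raise ValueError("teacher and student must share the action space")
        for t, s in zip(t_row, s_row):
            total += (t - s) ** 2
            count += 1
    if count == 0:
        raise ValueError("empty Q-tables")
    return total / count


def greedy_actions(q: QTable) -> List[int]:
    """Index of the highest Q-value per agent (first index wins ties)."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


# Toy values for two agents with two actions each (made up for illustration).
teacher_q = [[1.0, 2.0], [0.5, 0.0]]
student_q = [[1.0, 1.0], [0.5, 1.0]]

loss = distillation_loss(teacher_q, student_q)
teacher_choice = greedy_actions(teacher_q)
student_choice = greedy_actions(student_q)
```

In this toy case the student disagrees with the teacher on both agents' greedy actions, which is the gap a distillation objective of this kind would be trained to close. At execution time only the student would be needed, since it depends on partial observations alone.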
Related papers
- Is Centralized Training With Decentralized Execution Framework Centralized Enough For MARL? (2023)
- Tacit Learning With Adaptive Information Selection For Cooperative Multi-agent Reinforcement Learning (2024)
- GTDE: Grouped Training With Decentralized Execution For Multi-agent Actor-critic (2024)
- PTDE: Personalized Training With Distilled Execution For Multi-agent Reinforcement Learning (2022)
- STAS: Spatial-temporal Return Decomposition For Multi-agent Reinforcement Learning (2023)
- Taming Multi-agent Reinforcement Learning With Estimator Variance Reduction (2022)
- An Initial Introduction To Cooperative Multi-agent Reinforcement Learning (2024)
- From Explicit Communication To Tacit Cooperation: A Novel Paradigm For Cooperative MARL (2023)