Intrinsic Action Tendency Consistency For Cooperative Multi-agent Reinforcement Learning
2024 · Junkai Zhang, Yifan Zhang, Xi Sheryl Zhang, et al.
Abstract
Efficient collaboration in the centralized training with decentralized execution (CTDE) paradigm remains a challenge in cooperative multi-agent systems. We identify divergent action tendencies among agents as a significant obstacle to CTDE's training efficiency, requiring a large number of training samples to achieve a unified consensus on agents' policies. This divergence stems from the lack of adequate team consensus-related guidance signals during credit assignments in CTDE. To address this, we propose Intrinsic Action Tendency Consistency, a novel approach for cooperative multi-agent reinforcement learning. It integrates intrinsic rewards, obtained through an action model, into a reward-additive CTDE (RA-CTDE) framework. We formulate an action model that enables surrounding agents to predict the central agent's action tendency. Leveraging these predictions, we compute a cooperative intrinsic reward that encourages agents to match their actions with their neighbors' predictions. We
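The reward-shaping idea in the abstract can be illustrated with a small sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: the function names, the use of the mean predicted probability as the consistency score, and the weight `beta` are hypothetical and are not the authors' definitions. The sketch assumes each neighboring agent's action model outputs a probability distribution over the central agent's actions, and that the intrinsic term is simply added to the team reward.

```python
def consistency_reward(predicted_probs, action):
    """Hypothetical intrinsic reward: the mean probability that the
    neighbors' action models assign to the action the agent actually took.

    predicted_probs: one distribution per neighbor over the central
                     agent's actions (assumed action-model output).
    action:          index of the action the central agent chose.
    """
    if not predicted_probs:
        return 0.0  # no neighbors in range, so no consistency signal
    return sum(p[action] for p in predicted_probs) / len(predicted_probs)


def shaped_reward(team_reward, intrinsic, beta=0.1):
    """Additive shaping: team reward plus a weighted intrinsic term.
    The weight beta is an illustrative assumption."""
    return team_reward + beta * intrinsic


# Two neighbors both expect the central agent to favor action 0.
neighbor_predictions = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
]

r_match = consistency_reward(neighbor_predictions, action=0)
r_mismatch = consistency_reward(neighbor_predictions, action=2)
total = shaped_reward(team_reward=1.0, intrinsic=r_match)
```

Under these assumptions, choosing the action the neighbors expect yields a larger intrinsic reward than choosing one they consider unlikely, which is the kind of pressure toward consistent action tendencies the abstract describes.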
Related papers
- Optimistic ε-greedy Exploration For Cooperative Multi-agent Reinforcement Learning (2025)
- Learning Implicit Credit Assignment For Cooperative Multi-agent Reinforcement Learning (2020)
- DCIR: Dynamic Consistency Intrinsic Reward For Multi-agent Reinforcement Learning (2023)
- Tacit Learning With Adaptive Information Selection For Cooperative Multi-agent Reinforcement Learning (2024)
- Influence-based Reinforcement Learning For Intrinsically-motivated Agents (2021)
- Taming Multi-agent Reinforcement Learning With Estimator Variance Reduction (2022)
- Coordinated Exploration Via Intrinsic Rewards For Multi-agent Reinforcement Learning (2019)
- Fully Decentralized Cooperative Multi-agent Reinforcement Learning: A Survey (2024)