Modeling The Interaction Between Agents In Cooperative Multi-agent Reinforcement Learning
2021 · Xiaoteng Ma, Yiqin Yang, Chenghao Li, et al.
Abstract
Value-based methods of multi-agent reinforcement learning (MARL), especially value decomposition methods, have been demonstrated on a range of challenging cooperative tasks. However, current methods pay little attention to the interaction between agents, which is essential to teamwork in games or real life. This limits the efficiency of value-based MARL algorithms in two aspects: collaborative exploration and value function estimation. In this paper, we propose a novel cooperative MARL algorithm named interactive actor-critic (IAC), which models the interaction of agents from the perspectives of policy and value function. On the policy side, a multi-agent joint stochastic policy is introduced by adopting a collaborative exploration module, which is trained by maximizing the entropy-regularized expected return. On the value side, we use a shared attention mechanism to estimate the value function of each agent, which takes the impact of the teammates into consideration. At t
Related papers
- Policy Distillation And Value Matching In Multiagent Reinforcement Learning (2019) 10.48
- Adaptive Value Decomposition With Greedy Marginal Contribution Computation For Cooperative Multi-agent Reinforcement Learning (2023) 3.58
- Understanding Value Decomposition Algorithms In Deep Cooperative Multi-agent Reinforcement Learning (2022) 0.00
- Decomposed Soft Actor-critic Method For Cooperative Multi-agent Reinforcement Learning (2021) 0.00
- A Review Of Cooperative Multi-agent Deep Reinforcement Learning (2019) 19.08
- Dual Self-awareness Value Decomposition Framework Without Individual Global Max For Cooperative Multi-agent Reinforcement Learning (2023) 0.00
- Revisiting Some Common Practices In Cooperative Multi-agent Reinforcement Learning (2022) 0.00
- Learning To Coordinate In Multi-agent Systems: A Coordinated Actor-critic Algorithm And Finite-time Guarantees (2021) 0.00