Efficiently Quantifying Individual Agent Importance In Cooperative MARL
2023 · Omayma Mahjoub, Ruan de Kock, Siddarth Singh, et al.
Abstract
Measuring the contribution of individual agents is challenging in cooperative multi-agent reinforcement learning (MARL). In cooperative MARL, team performance is typically inferred from a single shared global reward. Arguably, among the best current approaches to effectively measure individual agent contributions is to use Shapley values. However, calculating these values is expensive as the computational complexity grows exponentially with respect to the number of agents. In this paper, we adapt difference rewards into an efficient method for quantifying the contribution of individual agents, referred to as Agent Importance, offering a linear computational complexity relative to the number of agents. We show empirically that the computed values are strongly correlated with the true Shapley values, as well as the true underlying individual agent rewards, used as the ground truth in environments where these are available. We demonstrate how Agent Importance can be used to help study MAR
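To make the complexity contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It compares exact Shapley values, which require evaluating every coalition, with a difference-reward-style score that compares the full team against the team with one agent removed. The coalition value function `toy_value`, the `SKILL` table, and the function names are all hypothetical and chosen for illustration only. In an actual MARL setting, the "team without agent i" reward would have to be obtained from the environment, which this sketch does not model.

```python
from itertools import combinations
from math import factorial


def shapley_values(n, value):
    """Exact Shapley values; evaluates `value` on all 2**n coalitions."""
    phi = [0.0] * n
    for i in range(n):
        others = [a for a in range(n) if a != i]
        for k in range(n):
            # Weight of a coalition of size k that does not contain agent i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def difference_importance(n, value):
    """Difference-reward-style score: v(team) - v(team without i).

    Needs only n + 1 evaluations of `value`, i.e. linear in n.
    """
    full = frozenset(range(n))
    v_full = value(full)
    return [v_full - value(full - {i}) for i in range(n)]


# Hypothetical toy team reward (an assumption, not from the paper):
# additive per-agent skill plus a synergy bonus when agents 0 and 1 cooperate.
SKILL = [3.0, 2.0, 1.0]


def toy_value(coalition):
    bonus = 1.0 if {0, 1} <= coalition else 0.0
    return sum(SKILL[a] for a in coalition) + bonus


shapley = shapley_values(3, toy_value)
importance = difference_importance(3, toy_value)
print("Shapley:   ", shapley)
print("Difference:", importance)
```

In this toy game the two scores differ in magnitude, because the difference-style score credits the full synergy bonus to each of agents 0 and 1, but they rank the agents in the same order. This says nothing about how closely the two agree in real environments, which is what the paper evaluates empirically.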
Related papers
- Adaptive Value Decomposition With Greedy Marginal Contribution Computation For Cooperative Multi-agent Reinforcement Learning (2023) 3.58
- Collective Explainable AI: Explaining Cooperative Strategies And Agent Contribution In Multiagent Reinforcement Learning With Shapley Values (2021) 0.00
- Shapley Q-value: A Local Reward Approach To Solve Global Reward Games (2019) 13.65
- Locality Matters: A Scalable Value Decomposition Approach For Cooperative Multi-agent Reinforcement Learning (2021) 0.00
- DIFFER: Decomposing Individual Reward For Fair Experience Replay In Multi-agent Reinforcement Learning (2023) 2.26
- SHAQ: Incorporating Shapley Value Theory Into Multi-agent Q-learning (2021) 0.00
- Quantifying Agent Interaction In Multi-agent Reinforcement Learning For Cost-efficient Generalization (2023) 0.00
- Shapley Counterfactual Credits For Multi-agent Reinforcement Learning (2021) 12.40