AIIR-MIX: Multi-agent Reinforcement Learning Meets Attention Individual Intrinsic Reward Mixing Network
2023 · Wei Li, Weiyan Liu, Shitong Shao, et al.
Abstract
Deducing each agent's contribution and assigning it the corresponding reward is a crucial problem in cooperative Multi-Agent Reinforcement Learning (MARL). Previous studies try to resolve the issue by designing an intrinsic reward function, but they simply combine the intrinsic reward with the environment reward by summation, which makes the performance of their MARL frameworks unsatisfactory. We propose a novel method named Attention Individual Intrinsic Reward Mixing Network (AIIR-MIX) in MARL, whose contributions are as follows: (a) we construct a novel intrinsic reward network based on the attention mechanism to make teamwork more effective; (b) we propose a Mixing network that is able to combine intrinsic and extrinsic rewards non-linearly and dynamically in response to changing conditions of the environment. We compare AIIR-MIX with many State-Of-The-Art (SOTA) MARL methods on battle games in StarCraft II. And the results demonstr
Related papers
- MIXRTS: Toward Interpretable Multi-agent Reinforcement Learning Via Mixing Recurrent Soft Decision Trees (2022) 7.16
- Influence-based Reinforcement Learning For Intrinsically-motivated Agents (2021) 0.00
- Attention-guided Contrastive Role Representations For Multi-agent Reinforcement Learning (2023) 3.64
- MIR: Efficient Exploration In Episodic Multi-agent Reinforcement Learning Via Mutual Intrinsic Reward (2025) 0.00
- DIFFER: Decomposing Individual Reward For Fair Experience Replay In Multi-agent Reinforcement Learning (2023) 2.26
- Enhancing Heterogeneous Multi-agent Cooperation In Decentralized MARL Via GNN-driven Intrinsic Rewards (2024) 0.00
- MMD-MIX: Value Function Factorisation With Maximum Mean Discrepancy For Cooperative Multi-agent Reinforcement Learning (2021) 0.00
- Agent-time Attention For Sparse Rewards Multi-agent Reinforcement Learning (2022) 0.00