SA-IGA: A Multiagent Reinforcement Learning Method Towards Socially Optimal Outcomes
2018 · Chengwei Zhang, Xiaohong Li, Jianye Hao, et al.
Abstract
In multiagent environments, the ability to learn is important for an agent to behave appropriately in the face of unknown opponents and a dynamic environment. From the system designer's perspective, it is desirable for the agents to learn to coordinate towards socially optimal outcomes while also avoiding exploitation by selfish opponents. To this end, we propose a novel gradient-ascent-based algorithm (SA-IGA) that augments the basic gradient-ascent algorithm by incorporating social awareness into the policy update process. We theoretically analyze the learning dynamics of SA-IGA using dynamical system theory, and show that SA-IGA has linear dynamics for a wide range of games, including symmetric games. The learning dynamics of two representative games (the prisoner's dilemma game and the coordination game) are analyzed in detail. Based on the idea of SA-IGA, we further propose a practical multiagent learning algorithm, called SA-PGA, based on the Q-learning update rule. Simulation
Related papers
- Socialgfs: Learning Social Gradient Fields For Multi-agent Reinforcement Learning (2024)
- Independent Generative Adversarial Self-imitation Learning In Cooperative Multiagent Systems (2019)
- Multi-agent Cooperation Through Learning-aware Policy Gradients (2024)
- Inclusive Fitness As A Key Step Towards More Advanced Social Behaviors In Multi-agent Reinforcement Learning Settings (2025)
- A Policy Gradient Algorithm For Learning To Learn In Multiagent Reinforcement Learning (2020)
- Cooperative Artificial Intelligence (2022)
- Social Learning Spontaneously Emerges By Searching Optimal Heuristics With Deep Reinforcement Learning (2022)
- LIGS: Learnable Intrinsic-reward Generation Selection For Multi-agent Learning (2021)