Towards Cooperation In Sequential Prisoner's Dilemmas: A Deep Multiagent Reinforcement Learning Approach
2018 · Weixun Wang, Jianye Hao, Yixi Wang, et al.
Abstract
The Iterated Prisoner's Dilemma has guided research on social dilemmas for decades. However, it distinguishes between only two atomic actions: cooperate and defect. In real-world prisoner's dilemmas, these choices are temporally extended and different strategies may correspond to sequences of actions, reflecting grades of cooperation. We introduce a Sequential Prisoner's Dilemma (SPD) game to better capture the aforementioned characteristics. In this work, we propose a deep multiagent reinforcement learning approach that investigates the evolution of mutual cooperation in SPD games. Our approach consists of two phases. The first phase is offline: it synthesizes policies with different cooperation degrees and then trains a cooperation degree detection network. The second phase is online: an agent adaptively selects its policy based on the detected degree of opponent cooperation. The effectiveness of our approach is demonstrated in two representative SPD 2D games: the Apple-Pear game and
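The online phase described in the abstract can be illustrated with a minimal sketch. The abstract does not say how a detected cooperation degree maps to a policy, so the nearest-degree matching rule below is an assumption. The names `select_policy_degree` and `policy_bank`, the string policy labels, and the [0, 1] range for cooperation degrees are likewise hypothetical. The detection network itself is not modelled; its output is passed in as a plain number.

```python
def select_policy_degree(detected_degree, available_degrees):
    """Return the available cooperation degree closest to the detected one.

    Assumption: degrees lie in [0, 1], with 0 meaning full defection and
    1 meaning full cooperation. The matching rule is illustrative only.
    """
    if not available_degrees:
        raise ValueError("available_degrees must not be empty")
    # Clip noisy detector output into the assumed valid range.
    clipped = min(1.0, max(0.0, detected_degree))
    # Pick the closest degree; on ties, the earliest entry wins.
    return min(available_degrees, key=lambda d: abs(d - clipped))


# Hypothetical bank of pre-trained policies, keyed by cooperation degree.
# In the offline phase these would be synthesized policies, not strings.
policy_bank = {
    0.0: "defect_policy",
    0.5: "mixed_policy",
    1.0: "cooperate_policy",
}

# Suppose the detector estimates the opponent's cooperation degree as 0.8.
chosen_degree = select_policy_degree(0.8, sorted(policy_bank))
chosen_policy = policy_bank[chosen_degree]
```

Matching the opponent's detected degree is only one possible rule; any mapping from detected degree to policy fits the same interface.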
Related papers
- Cooperation Dynamics In Multi-agent Systems: Exploring Game-theoretic Scenarios With Mean-field Equilibria (2023)
- Learning Through Probing: A Decentralized Reinforcement Learning Architecture For Social Dilemmas (2018)
- Dilution, Diffusion And Symbiosis In Spatial Prisoner's Dilemma With Reinforcement Learning (2025)
- Learning Multiagent Coordination In The Absence Of Communication Channels (2018)
- Online Learning In Iterated Prisoner's Dilemma To Mimic Human Behavior (2020)
- Intrinsic Fluctuations Of Reinforcement Learning Promote Cooperation (2022)
- Improved Cooperation By Balancing Exploration And Exploitation In Intertemporal Social Dilemma Tasks (2021)
- Symmetric Equilibrium Of Multi-agent Reinforcement Learning In Repeated Prisoner's Dilemma (2021)