Symmetric Equilibrium Of Multi-agent Reinforcement Learning In Repeated Prisoner's Dilemma
2021 · Yuki Usui, Masahiko Ueda
Abstract
We investigate the repeated prisoner's dilemma game in which both players alternately use reinforcement learning to obtain their optimal memory-one strategies. We theoretically solve the simultaneous Bellman optimality equations of reinforcement learning. We find that, among all deterministic memory-one strategies, the Win-stay Lose-shift strategy, the Grim strategy, and the strategy that always defects can form a symmetric equilibrium of the mutual reinforcement learning process.
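To make the three named strategies concrete, the following is a minimal sketch of deterministic memory-one strategies written as lookup tables from the previous round's outcome to the next action. The encoding (a state is the pair "my last action, opponent's last action"), the names `WSLS`, `GRIM`, `ALLD`, and the helper functions are illustrative assumptions, not notation from the paper. The sketch only simulates play; it does not solve the Bellman optimality equations or test the equilibrium conditions.

```python
from itertools import product

C, D = "C", "D"

# A state is (my previous action, opponent's previous action).
# This encoding is an assumption made for illustration.
STATES = [(C, C), (C, D), (D, C), (D, D)]

# Win-stay Lose-shift: repeat the last action after CC or DC, switch after CD or DD.
WSLS = {(C, C): C, (C, D): D, (D, C): D, (D, D): C}
# Grim (memory-one form): cooperate only after mutual cooperation.
GRIM = {(C, C): C, (C, D): D, (D, C): D, (D, D): D}
# Always defect.
ALLD = {state: D for state in STATES}


def all_deterministic_memory_one():
    """Enumerate every deterministic memory-one strategy (2**4 = 16 tables)."""
    return [dict(zip(STATES, actions)) for actions in product((C, D), repeat=4)]


def step(strategy_1, strategy_2, state):
    """Advance one round; `state` is written from player 1's point of view."""
    a1, a2 = state
    # Player 2 sees the same outcome with the two roles swapped.
    return strategy_1[(a1, a2)], strategy_2[(a2, a1)]


def self_play(strategy, start, rounds):
    """Return the list of states visited when both players use `strategy`."""
    trajectory = [start]
    for _ in range(rounds):
        trajectory.append(step(strategy, strategy, trajectory[-1]))
    return trajectory


if __name__ == "__main__":
    for name, strategy in [("WSLS", WSLS), ("GRIM", GRIM), ("ALLD", ALLD)]:
        print(name, self_play(strategy, (C, D), 3))
```

In this sketch, self-play of Win-stay Lose-shift started from a unilateral defection passes through mutual defection and then returns to mutual cooperation, whereas Grim and always-defect stay at mutual defection.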
Related papers
- Memory-two Strategies Forming Symmetric Mutual Reinforcement Learning Equilibrium In Repeated Prisoners' Dilemma Game (2021)
- Towards Cooperation In Sequential Prisoner's Dilemmas: A Deep Multiagent Reinforcement Learning Approach (2018)
- Learning Multiagent Coordination In The Absence Of Communication Channels (2018)
- A Black-box Approach For Non-stationary Multi-agent Reinforcement Learning (2023)
- Dilution, Diffusion And Symbiosis In Spatial Prisoner's Dilemma With Reinforcement Learning (2025)
- Cooperation Dynamics In Multi-agent Systems: Exploring Game-theoretic Scenarios With Mean-field Equilibria (2023)
- Learning In Multi-memory Games Triggers Complex Dynamics Diverging From Nash Equilibrium (2023)
- On Information Asymmetry In Competitive Multi-agent Reinforcement Learning: Convergence And Optimality (2020)