Learning Through Probing: A Decentralized Reinforcement Learning Architecture For Social Dilemmas
2018 · Nicolas Anastassacos, Mirco Musolesi
Abstract
Multi-agent reinforcement learning has received significant interest in recent years, notably because advances in deep reinforcement learning have enabled the development of new architectures and learning algorithms. Using social dilemmas as the training ground, we present a novel learning architecture, Learning through Probing (LTP), in which agents use a probing mechanism to account for how their opponent's behavior changes when an agent takes an action. We use distinct training phases and adjust rewards according to the overall outcome of the experiences, accounting for changes to the opponent's behavior. We introduce a parameter eta to determine the significance of these future changes to opponent behavior. When applied to the Iterated Prisoner's Dilemma (IPD), LTP agents demonstrate that they can learn to cooperate with each other, achieving higher average cumulative rewards than other reinforcement learning methods while also maintaining good performance in p
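For readers unfamiliar with the setting, the sketch below illustrates the Iterated Prisoner's Dilemma and the general idea of weighting a reward by a change in opponent behavior. It is not the authors' implementation. The payoff values (5, 3, 1, 0) are the conventional textbook ones and are an assumption here, and `adjusted_reward` is a hypothetical formula chosen only to show what a weight such as eta could do; the abstract does not state the actual LTP update.

```python
# Conventional IPD payoffs (temptation 5, reward 3, punishment 1, sucker 0).
# Assumed for illustration; the abstract does not give the values used.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def play(policy_a, policy_b, rounds=10):
    """Return cumulative rewards for two policies.

    Each policy maps the opponent's previous move (None on the first
    round) to "C" (cooperate) or "D" (defect).
    """
    last_a = last_b = None
    total_a = total_b = 0
    for _ in range(rounds):
        a, b = policy_a(last_b), policy_b(last_a)
        reward_a, reward_b = PAYOFF[(a, b)]
        total_a += reward_a
        total_b += reward_b
        last_a, last_b = a, b
    return total_a, total_b


def always_defect(_opponent_last):
    return "D"


def tit_for_tat(opponent_last):
    # Cooperate first, then copy the opponent's previous move.
    return "C" if opponent_last in (None, "C") else "D"


def adjusted_reward(reward, coop_before, coop_after, eta):
    """Hypothetical eta-weighted adjustment (NOT the formula from the paper).

    coop_before / coop_after are the opponent's estimated cooperation
    rates before and after the agent's action; eta scales how much the
    change in opponent behavior matters relative to the immediate reward.
    """
    return reward + eta * (coop_after - coop_before)
```

Under these assumed payoffs, two tit-for-tat players each earn 3 per round, while two defectors each earn 1 per round, which is the gap that makes learned cooperation worthwhile. With `eta = 0` the hypothetical adjustment leaves the immediate reward unchanged.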
Related papers
- Towards Cooperation In Sequential Prisoner's Dilemmas: A Deep Multiagent Reinforcement Learning Approach (2018)
- Understanding The World To Solve Social Dilemmas Using Multi-agent Reinforcement Learning (2023)
- Online Learning In Iterated Prisoner's Dilemma To Mimic Human Behavior (2020)
- Evolutionary Multi-agent Reinforcement Learning In Group Social Dilemmas (2024)
- Deception In Social Learning: A Multi-agent Reinforcement Learning Perspective (2021)
- Exploring The Impact Of Tunable Agents In Sequential Social Dilemmas (2021)
- Improved Cooperation By Balancing Exploration And Exploitation In Intertemporal Social Dilemma Tasks (2021)
- Learning Multiagent Coordination In The Absence Of Communication Channels (2018)