Heterogeneous Social Value Orientation Leads To Meaningful Diversity In Sequential Social Dilemmas
2023 · Udari Madhushani, Kevin R. McKee, John P. Agapiou, et al.
Abstract
In social psychology, Social Value Orientation (SVO) describes an individual's propensity to allocate resources between themself and others. In reinforcement learning, SVO has been instantiated as an intrinsic motivation that remaps an agent's rewards based on particular target distributions of group reward. Prior studies show that groups of agents endowed with heterogeneous SVO learn diverse policies in settings that resemble the incentive structure of the Prisoner's Dilemma. Our work extends this body of results and demonstrates that (1) heterogeneous SVO leads to meaningfully diverse policies across a range of incentive structures in sequential social dilemmas, as measured by task-specific diversity metrics; and (2) learning a best response to such policy diversity leads to better zero-shot generalization in some situations. We show that these best-response agents learn policies that are conditioned on their co-players, which we posit is the reason for improved zero-shot generalization.
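The abstract does not state how SVO remaps rewards. As a rough illustration only, the sketch below assumes the reward-angle formulation used in earlier work on SVO in mixed-motive reinforcement learning: the agent's own reward and the mean reward of its co-players define an angle, and the agent is penalized in proportion to the gap between that angle and its target SVO angle. The function names, the `weight` parameter, and the example target angles are hypothetical and are not taken from this paper.

```python
import math


def reward_angle(own_reward, others_mean_reward):
    """Angle of the vector (own reward, mean co-player reward), in radians.

    0 means all reward goes to the agent itself; pi/4 means an even split.
    """
    return math.atan2(others_mean_reward, own_reward)


def svo_shaped_reward(rewards, agent, target_angle_deg, weight=1.0):
    """Assumed SVO shaping: own reward minus a penalty for missing the target angle."""
    own = rewards[agent]
    others = [r for i, r in enumerate(rewards) if i != agent]
    others_mean = sum(others) / len(others)
    if own == 0 and others_mean == 0:
        # The angle is undefined when nobody receives reward; apply no penalty.
        return own
    deviation = abs(math.radians(target_angle_deg) - reward_angle(own, others_mean))
    return own - weight * deviation


# A heterogeneous group: each agent has its own (hypothetical) target angle.
group_rewards = [1.0, 1.0, 1.0]
target_angles_deg = [0.0, 45.0, 90.0]
shaped = [
    svo_shaped_reward(group_rewards, i, angle)
    for i, angle in enumerate(target_angles_deg)
]
```

With an even split of reward, only the agent whose target is 45 degrees keeps its full reward; the other two are penalized for the mismatch, which is the kind of pressure that could push differently oriented agents toward different policies.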
Related papers
- Social Diversity And Social Preferences In Mixed-motive Reinforcement Learning (2020)
- Unifying Behavioral And Response Diversity For Open-ended Learning In Zero-sum Games (2021)
- The Impact Of Behavioral Diversity In Multi-agent Reinforcement Learning (2024)
- DGPO: Discovering Multiple Strategies With Diversity-guided Policy Optimization (2022)
- Dynamics Of Moral Behavior In Heterogeneous Populations Of Learning Agents (2024)
- System Neural Diversity: Measuring Behavioral Heterogeneity In Multi-agent Learning (2023)
- Improved Cooperation By Balancing Exploration And Exploitation In Intertemporal Social Dilemma Tasks (2021)
- Exploring The Impact Of Tunable Agents In Sequential Social Dilemmas (2021)