Unifying Behavioral And Response Diversity For Open-ended Learning In Zero-sum Games
2021 · Xiangyu Liu, Hangtian Jia, Ying Wen, et al.
Abstract
Measuring and promoting policy diversity is critical for solving games with strong non-transitive dynamics where strategic cycles exist, and there is no consistent winner (e.g., Rock-Paper-Scissors). With that in mind, maintaining a pool of diverse policies via open-ended learning is an attractive solution, which can generate auto-curricula to avoid being exploited. However, in conventional open-ended learning algorithms, there are no widely accepted definitions for diversity, making it hard to construct and evaluate the diverse policies. In this work, we summarize previous concepts of diversity and work towards offering a unified measure of diversity in multi-agent open-ended learning to include all elements in Markov games, based on both Behavioral Diversity (BD) and Response Diversity (RD). At the trajectory distribution level, we re-define BD in the state-action space as the discrepancies of occupancy measures. For the reward dynamics, we propose RD to characterize diversity throug
Related papers
- The Impact Of Behavioral Diversity In Multi-agent Reinforcement Learning (2024) · 0.00
- Diverse Policies Converge In Reward-free Markov Decision Processes (2023) · 0.00
- Effective Diversity In Population Based Reinforcement Learning (2020) · 0.00
- System Neural Diversity: Measuring Behavioral Heterogeneity In Multi-agent Learning (2023) · 0.00
- DGPO: Discovering Multiple Strategies With Diversity-guided Policy Optimization (2022) · 2.26
- Structured Diversity Control: A Dual-level Framework For Group-aware Multi-agent Coordination (2025) · 0.00
- Diversity-inducing Policy Gradient: Using Maximum Mean Discrepancy To Find A Set Of Diverse Policies (2019) · 8.35
- Social Diversity And Social Preferences In Mixed-motive Reinforcement Learning (2020) · 0.00