Neural Auto-Curricula
2021 · Xidong Feng, Oliver Slumbers, Ziyu Wan, et al.
Abstract
When solving two-player zero-sum games, multi-agent reinforcement learning (MARL) algorithms often create populations of agents where, at each iteration, a new agent is discovered as the best response to a mixture over the opponent population. Within such a process, the update rules of "who to compete with" (i.e., the opponent mixture) and "how to beat them" (i.e., finding best responses) are underpinned by manually developed game theoretical principles such as fictitious play and Double Oracle. In this paper, we introduce a novel framework -- Neural Auto-Curricula (NAC) -- that leverages meta-gradient descent to automate the discovery of the learning update rule without explicit human design. Specifically, we parameterise the opponent selection module by neural networks and the best-response module by optimisation subroutines, and update their parameters solely via interaction with the game engine, where both players aim to minimise their exploitability. Surprisingly, even without hum
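To make the abstract's terms concrete, here is a minimal sketch of one of the hand-designed update rules it mentions, fictitious play, on a small matrix game. It is not the NAC method: here the opponent mixture is fixed by hand as the empirical average of past best responses, which is the part NAC would instead learn. The rock-paper-scissors payoff matrix and every function name below are assumptions made for illustration only.

```python
# Hypothetical illustration: fictitious play on rock-paper-scissors.
# PAYOFF[i][j] is the row player's payoff; the column player receives its negative.
PAYOFF = [
    [0.0, -1.0, 1.0],   # rock
    [1.0, 0.0, -1.0],   # paper
    [-1.0, 1.0, 0.0],   # scissors
]


def normalise(counts):
    """Turn action counts into a probability mixture."""
    total = sum(counts)
    return [c / total for c in counts]


def row_values(payoff, col_mix):
    """Expected payoff of each pure row action against a column mixture."""
    return [sum(a * q for a, q in zip(row, col_mix)) for row in payoff]


def col_values(payoff, row_mix):
    """Expected payoff to the column player of each pure column action."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    return [-sum(row_mix[i] * payoff[i][j] for i in range(n_rows))
            for j in range(n_cols)]


def argmax(values):
    """Index of the first maximal entry (deterministic tie-breaking)."""
    return max(range(len(values)), key=lambda k: values[k])


def exploitability(payoff, row_mix, col_mix):
    """Total gain both players could obtain by switching to a best response.

    In a zero-sum game the value of the current profile cancels, so this is
    the sum of the two best-response payoffs. It is zero at a Nash equilibrium.
    """
    return max(row_values(payoff, col_mix)) + max(col_values(payoff, row_mix))


def fictitious_play(payoff, iterations):
    """Each player repeatedly best-responds to the opponent's empirical mixture."""
    row_counts = [1.0] + [0.0] * (len(payoff) - 1)      # both start with action 0
    col_counts = [1.0] + [0.0] * (len(payoff[0]) - 1)
    for _ in range(iterations):
        row_mix, col_mix = normalise(row_counts), normalise(col_counts)
        # "Who to compete with": the average of the opponent's past play.
        # "How to beat them": an exact best response to that mixture.
        best_row = argmax(row_values(payoff, col_mix))
        best_col = argmax(col_values(payoff, row_mix))
        row_counts[best_row] += 1.0
        col_counts[best_col] += 1.0
    return normalise(row_counts), normalise(col_counts)


if __name__ == "__main__":
    start = exploitability(PAYOFF, [1.0, 0.0, 0.0], [1.0, 0.0, 0.0])
    row_mix, col_mix = fictitious_play(PAYOFF, 2000)
    print("initial exploitability:", start)
    print("final exploitability:", exploitability(PAYOFF, row_mix, col_mix))
```

In this sketch the exploitability falls as the empirical mixtures approach the uniform equilibrium of rock-paper-scissors. Double Oracle follows the same loop but, roughly, best-responds to the Nash mixture of the restricted game over the current population rather than to the empirical average.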
Related papers
- Accelerate Multi-agent Reinforcement Learning in Zero-sum Games with Subgame Curriculum Learning (2023) · 0.00
- Stackelberg Games for Learning Emergent Behaviors During Competitive Autocurricula (2023) · 5.84
- Efficient Competitive Self-play Policy Optimization (2020) · 0.00
- ColosseumRL: A Framework for Multiagent Reinforcement Learning in \(n\)-player Games (2019) · 0.00
- Discovering Multiagent Learning Algorithms with Large Language Models (2026) · 2.05
- Towards Skilled Population Curriculum for Multi-agent Reinforcement Learning (2023) · 0.00
- A Generalized Training Approach for Multiagent Learning (2019) · 0.00
- Fictitious Cross-play: Learning Global Nash Equilibrium in Mixed Cooperative-competitive Games (2023) · 3.58