"Other-Play" for Zero-Shot Coordination
2020 · Hengyuan Hu, Adam Lerer, Alex Peysakhovich, et al.
Abstract
We consider the problem of zero-shot coordination - constructing AI agents that can coordinate with novel partners they have not seen before (e.g. humans). Standard Multi-Agent Reinforcement Learning (MARL) methods typically focus on the self-play (SP) setting where agents construct strategies by playing the game with themselves repeatedly. Unfortunately, applying SP naively to the zero-shot coordination problem can produce agents that establish highly specialized conventions that do not carry over to novel partners they have not been trained with. We introduce a novel learning algorithm called other-play (OP), that enhances self-play by looking for more robust strategies, exploiting the presence of known symmetries in the underlying problem. We characterize OP theoretically as well as experimentally. We study the cooperative card game Hanabi and show that OP agents achieve higher scores when paired with independently trained agents. In preliminary results we also show that our OP agen
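The contrast the abstract draws between self-play and other-play can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the abstract: the toy matching game (two players each pick a lever and are paid only if they pick the same one), the payoff values, and the function names. Several levers are interchangeable, so their labels form a known symmetry. Self-play scores a policy against an exact copy of itself, while the other-play score averages over partners whose interchangeable levers have been relabeled.

```python
import itertools

def lever_payoffs(n_equal, odd_payoff):
    # Hypothetical toy game: n_equal interchangeable levers pay 1.0,
    # and one distinguishable lever pays odd_payoff, when both players match.
    return [1.0] * n_equal + [odd_payoff]

def expected_return(policy_a, policy_b, pay):
    # Players choose independently; reward is earned only on a match.
    return sum(a * b * r for a, b, r in zip(policy_a, policy_b, pay))

def self_play_return(policy, pay):
    # Self-play: the partner is an exact copy of the same policy.
    return expected_return(policy, policy, pay)

def other_play_return(policy, pay, n_equal):
    # Other-play style score: average over all relabelings of the
    # interchangeable levers applied to the partner's policy.
    perms = list(itertools.permutations(range(n_equal)))
    total = 0.0
    for perm in perms:
        relabeled = [policy[perm[i]] for i in range(n_equal)] + policy[n_equal:]
        total += expected_return(policy, relabeled, pay)
    return total / len(perms)

pay = lever_payoffs(3, 0.9)
pick_first = [1.0, 0.0, 0.0, 0.0]  # arbitrary convention: always lever 0
pick_odd = [0.0, 0.0, 0.0, 1.0]    # always the distinguishable lever

for name, policy in [("pick_first", pick_first), ("pick_odd", pick_odd)]:
    sp = self_play_return(policy, pay)
    op = other_play_return(policy, pay, 3)
    print(name, "self-play:", round(sp, 3), "other-play:", round(op, 3))
```

In this sketch the arbitrary convention scores highest under self-play but drops to roughly a third of that once the partner's labels are permuted, whereas the policy that relies on the distinguishable lever keeps its score. This is only meant to show why a symmetry-averaged objective favors conventions that survive relabeling; it is not the training procedure used for Hanabi.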
Related papers
- Towards Few-shot Coordination: Revisiting Ad-hoc Teamplay Challenge In The Game Of Hanabi (2023)
- Tackling Cooperative Incompatibility For Zero-shot Human-AI Coordination (2023)
- Cooperative Open-ended Learning Framework For Zero-shot Coordination (2023)
- Role Play: Learning Adaptive Role-specific Strategies In Multi-agent Interactions (2024)
- A New Formalism, Method And Open Issues For Zero-shot Coordination (2021)
- Heterogeneous Multi-agent Zero-shot Coordination By Coevolution (2022)
- Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination (2025)
- Mastering Zero-shot Interactions In Cooperative And Competitive Simultaneous Games (2024)