Mastering Zero-shot Interactions in Cooperative and Competitive Simultaneous Games
2024 · Yannik Mahlau, Frederik Schubert, Bodo Rosenhahn
Abstract
The combination of self-play and planning has achieved great success in sequential games, for instance in Chess and Go. However, adapting algorithms such as AlphaZero to simultaneous games poses a new challenge. In these games, missing information about the concurrent actions of other agents is a limiting factor, because those agents may select different Nash equilibria or may not play optimally at all. Thus, it is vital to model the behavior of the other agents when interacting with them in simultaneous games. To this end, we propose Albatross: AlphaZero for Learning Bounded-rational Agents and Temperature-based Response Optimization using Simulated Self-play. Albatross learns to play the novel equilibrium concept of a Smooth Best Response Logit Equilibrium (SBRLE), which enables cooperation and competition with agents of any playing strength. We perform an extensive evaluation of Albatross on a set of cooperative and competitive simultaneous perfect-information games. In contrast to AlphaZero, Albatr
Related papers
- "Other-Play" for Zero-shot Coordination (2020) · 0.00
- Reinforcement Learning in Two Player Zero Sum Simultaneous Action Games (2021) · 0.00
- Impartial Games: A Challenge for Reinforcement Learning (2022) · 0.00
- Tackling Cooperative Incompatibility for Zero-shot Human-AI Coordination (2023) · 0.00
- Cooperative Open-ended Learning Framework for Zero-shot Coordination (2023) · 0.00
- Fictitious Cross-play: Learning Global Nash Equilibrium in Mixed Cooperative-competitive Games (2023) · 3.58
- Efficient Competitive Self-play Policy Optimization (2020) · 0.00
- Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination (2025) · 0.00