Reinforcement Learning In Two Player Zero Sum Simultaneous Action Games
2021 · Patrick Phillips
Abstract
Two-player zero-sum simultaneous-action games are common in video games, financial markets, war, business competition, and many other settings. We first introduce the fundamental concepts of reinforcement learning in two-player zero-sum simultaneous-action games and discuss the unique challenges this type of game poses. We then introduce two novel agents that attempt to handle these challenges using joint-action Deep Q-Networks (DQN). The first agent, the Best Response AgenT (BRAT), builds an explicit model of its opponent's policy using imitation learning and then uses this model to find the best response that exploits the opponent's strategy. The second agent, Meta-Nash DQN, builds an implicit model of its opponent's policy in order to produce a context variable that is used as part of the Q-value calculation. An explicit minimax over Q-values is used to find actions close to Nash equilibrium. We find empirically that both agents converge to Nash equilibrium in a self-play se
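The two action-selection ideas named in the abstract, a best response against a modeled opponent policy and a minimax over joint-action Q-values, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the rock-paper-scissors payoff matrix standing in for learned Q-values, and the opponent policy vector are all assumptions made for illustration, and the minimax here is restricted to pure strategies.

```python
import numpy as np


def best_response(q, opponent_policy):
    """Return our action with the highest expected Q-value against a
    fixed opponent action distribution (hypothetical helper)."""
    expected = q @ opponent_policy  # expected value of each of our actions
    return int(np.argmax(expected))


def maximin_action(q):
    """Return (action, value) for the pure-strategy maximin: the action
    whose worst-case Q-value over opponent actions is largest."""
    worst_case = q.min(axis=1)  # opponent picks the column that hurts us most
    action = int(np.argmax(worst_case))
    return action, float(worst_case[action])


# Assumed joint-action Q-values for a single state.
# Rows: our action, columns: opponent action (rock, paper, scissors).
q = np.array([
    [0.0, -1.0, 1.0],
    [1.0, 0.0, -1.0],
    [-1.0, 1.0, 0.0],
])

# Assumed output of an opponent model: the opponent favors rock.
opponent_policy = np.array([0.6, 0.2, 0.2])

print(best_response(q, opponent_policy))  # index of the exploiting action
print(maximin_action(q))                  # pure-strategy worst-case choice
```

In this example the best response to a rock-heavy opponent is paper (index 1), while the pure-strategy maximin guarantees only -1. The equilibrium of rock-paper-scissors is mixed, so approaching Nash equilibrium in general would require a minimax over mixed strategies, for instance via a linear program over the same Q-matrix.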
Related papers
- A Deep Reinforcement Learning Approach For Finding Non-exploitable Strategies In Two-player Atari Games (2022)
- Colosseumrl: A Framework For Multiagent Reinforcement Learning In \(n\)-player Games (2019)
- Mastering Zero-shot Interactions In Cooperative And Competitive Simultaneous Games (2024)
- Decentralized Q-learning In Zero-sum Markov Games (2021)
- A Generalized Minimax Q-learning Algorithm For Two-player Zero-sum Stochastic Games (2019)
- Resolving Implicit Coordination In Multi-agent Deep Reinforcement Learning With Deep Q-networks & Game Theory (2020)
- Simplified Action Decoder For Deep Multi-agent Reinforcement Learning (2019)
- Multi-agent Training Beyond Zero-sum With Correlated Equilibrium Meta-solvers (2021)