Deep Reinforcement Learning From Self-play In Imperfect-information Games
2016 · Johannes Heinrich, David Silver
Abstract
Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged. In Limit Texas Holdem, a poker game of real-world scale, NFSP learnt a strategy that approached the performance of state-of-the-art, superhuman algorithms based on significant domain expertise.
Related papers
- Combining Deep Reinforcement Learning And Search For Imperfect-information Games (2020)
- Score-based Equilibrium Learning In Multi-player Finite Games With Imperfect Information (2023)
- Offline Fictitious Self-play For Competitive Games (2024)
- Improving Fictitious Play Reinforcement Learning With Expanding Models (2019)
- Anticipatory Fictitious Play (2022)
- Approximate Exploitability: Learning A Best Response In Large Games (2020)
- Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games? (2017)
- Mastering Strategy Card Game (Legends Of Code And Magic) Via End-to-end Policy And Optimistic Smooth Fictitious Play (2023)