Combining Tree-search, Generative Models, And Nash Bargaining Concepts In Game-theoretic Reinforcement Learning
2023 · Zun Li, Marc Lanctot, Kevin R. McKee, et al.
Abstract
Opponent modeling methods typically involve two crucial steps: building a belief distribution over opponents' strategies, and exploiting this opponent model by playing a best response. However, existing approaches typically require domain-specific heuristics to come up with such a model, and algorithms for approximating best responses are hard to scale in large, imperfect information domains. In this work, we introduce a scalable and generic multiagent training regime for opponent modeling using deep game-theoretic reinforcement learning. We first propose Generative Best Response (GenBR), a best response algorithm based on Monte-Carlo Tree Search (MCTS) with a learned deep generative model that samples world states during planning. This new method scales to large imperfect information domains and can be plug and play in a variety of multiagent algorithms. We use this new method under the framework of Policy Space Response Oracles (PSRO), to automate the generation of an *offline oppo
Related papers
- Generative Evolutionary Meta-solver (GEMS): Scalable Surrogate-free Multi-agent Reinforcement Learning (2025)
- A Generalized Training Approach For Multiagent Learning (2019)
- Brexit: On Opponent Modelling In Expert Iteration (2022)
- Know Your Enemy: Investigating Monte-carlo Tree Search With Opponent Models In Pommerman (2023)
- Neural Auto-curricula (2021)
- Incentivize Without Bonus: Provably Efficient Model-based Online Multi-agent RL For Markov Games (2025)
- Combining Off And On-policy Training In Model-based Reinforcement Learning (2021)
- Minimax-optimal Multi-agent RL In Markov Games With A Generative Model (2022)