Simulation-free PSRO: Removing Game Simulation From Policy Space Response Oracles
2025 · Yingzhuo Liu, Shuodi Liu, Weijun Luo, et al.
Abstract
Policy Space Response Oracles (PSRO) combines game-theoretic equilibrium computation with learning and is effective at approximating Nash equilibria in zero-sum games. However, its computational cost significantly limits its practical application. Our analysis shows that game simulation is the primary bottleneck in PSRO's runtime. To address this issue, we distill the concept of Simulation-Free PSRO and summarize existing methods that instantiate it. Additionally, we propose a novel Dynamic Window-based Simulation-Free PSRO, which replaces the strategy set maintained in PSRO with a strategy window. The window holds a limited number of strategies, which simplifies opponent strategy selection and improves the robustness of the best response. Moreover, we use Nash Clustering to select which strategy to eliminate, keeping the number of strategies in the window bounded. O
Related papers
- Pipeline PSRO: A Scalable Approach For Finding Approximate Nash Equilibria In Large Games (2020) · 0.00
- Fusion-PSRO: Nash Policy Fusion For Policy Space Response Oracles (2024) · 3.58
- A Generalized Training Approach For Multiagent Learning (2019) · 0.00
- Fictitious Cross-play: Learning Global Nash Equilibrium In Mixed Cooperative-competitive Games (2023) · 3.58
- Conservative Equilibrium Discovery In Offline Game-theoretic Multiagent Reinforcement Learning (2026) · 0.00
- A Unified Perspective On Deep Equilibrium Finding (2022) · 0.00
- Learning Equilibria In Mean-field Games: Introducing Mean-field PSRO (2021) · 0.00
- Offline Fictitious Self-play For Competitive Games (2024) · 0.00