Best-response Dynamics And Fictitious Play In Identical-interest And Zero-sum Stochastic Games
2021 · Lucas Baudin, Rida Laraki
Abstract
This paper combines ideas from Q-learning and fictitious play to define three reinforcement learning procedures which converge to the set of stationary mixed Nash equilibria in identical-interest discounted stochastic games. First, we analyse three continuous-time systems that generalize the best-response dynamics defined by Leslie et al. for zero-sum discounted stochastic games. Under some assumptions depending on the system, the dynamics are shown to converge to the set of stationary equilibria in identical-interest discounted stochastic games. Then, we introduce three analogous discrete-time procedures in the spirit of Sayin et al. and demonstrate their convergence to the set of stationary equilibria using our results in continuous time together with stochastic approximation techniques. Some numerical experiments complement our theoretical findings.
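To make the idea of mixing fictitious play with a Q-learning-style value estimate concrete, here is a minimal sketch for a two-player identical-interest discounted stochastic game. It is not one of the three procedures from the paper: the function name, the array layout (`r[s, a, b]`, `P[s, a, b, s']`), and the step sizes are all assumptions chosen for illustration. Beliefs about each player's action frequencies are updated on a faster timescale, and the state values on a slower one.

```python
import numpy as np

def fictitious_play_sketch(r, P, gamma=0.9, steps=2000, seed=0):
    """Illustrative two-timescale fictitious play (hypothetical helper).

    r[s, a, b]     : common reward shared by both players (assumed layout)
    P[s, a, b, s2] : probability of moving from state s to state s2
    Returns the per-state empirical action frequencies and value estimates.
    """
    rng = np.random.default_rng(seed)
    n_states, n_a, n_b = r.shape
    # Start from uniform beliefs about each player's play in every state.
    pi_a = np.full((n_states, n_a), 1.0 / n_a)
    pi_b = np.full((n_states, n_b), 1.0 / n_b)
    v = np.zeros(n_states)                  # estimated continuation values
    visits = np.zeros(n_states, dtype=int)  # visit counts drive the step sizes
    s = 0
    for _ in range(steps):
        visits[s] += 1
        n = visits[s]
        # Auxiliary stage game at s: reward plus discounted expected value.
        q = r[s] + gamma * (P[s] @ v)       # shape (n_a, n_b)
        # Each player best-responds to the other's empirical frequencies.
        a = int(np.argmax(q @ pi_b[s]))
        b = int(np.argmax(pi_a[s] @ q))
        # Fast timescale: move beliefs toward the actions just played.
        alpha = 1.0 / (n + 1)
        pi_a[s] += alpha * (np.eye(n_a)[a] - pi_a[s])
        pi_b[s] += alpha * (np.eye(n_b)[b] - pi_b[s])
        # Slow timescale (assumed schedule): track the stage-game payoff
        # under the current beliefs.
        beta = 1.0 / ((n + 1) * np.log(n + 1))
        v[s] += beta * (pi_a[s] @ q @ pi_b[s] - v[s])
        # Sample the next state from the transition kernel.
        s = int(rng.choice(n_states, p=P[s, a, b]))
    return pi_a, pi_b, v
```

Calling `fictitious_play_sketch(r, P)` with a reward array of shape `(S, A, B)` and a transition array of shape `(S, A, B, S)` returns the belief tables and the value vector. The sketch makes no convergence claim of its own; the conditions under which such schemes converge are the subject of the paper.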
Related papers
- Fictitious Play In Zero-sum Stochastic Games (2020)
- On The Heterogeneity Of Independent Learning Dynamics In Zero-sum Stochastic Games (2021)
- Convergence Of Heterogeneous Learning Dynamics In Zero-sum Stochastic Games (2023)
- On The Global Convergence Of Stochastic Fictitious Play In Stochastic Games With Turn-based Controllers (2022)
- Actor-dual-critic Dynamics For Zero-sum And Identical-interest Stochastic Games (2026)
- Last-iterate Convergence Of Payoff-based Independent Learning In Zero-sum Stochastic Games (2024)
- A Finite-sample Analysis Of Payoff-based Independent Learning In Zero-sum Stochastic Games (2023)
- Fictitious Play In Markov Games With Single Controller (2022)