On The Impossibility Of Convergence Of Mixed Strategies With No Regret Learning
2020 Β· Vidya Muthukumar, Soham Phade, Anant Sahai
Abstract
We study the limiting behavior of the mixed strategies that result from optimal no-regret learning strategies in a repeated game setting where the stage game is any 2 by 2 competitive game. We consider optimal no-regret algorithms that are mean-based and monotonic in their argument. We show that for any such algorithm, the limiting mixed strategies of the players cannot converge almost surely to any Nash equilibrium. This negative result is also shown to hold under a broad relaxation of these assumptions, including popular variants of Online-Mirror-Descent with optimism and/or adaptive step-sizes. Finally, we conjecture that the monotonicity assumption can be removed, and provide partial evidence for this conjecture. Our results identify the inherent stochasticity in players' realizations as a critical factor underlying this divergence in outcomes between using the opponent's mixtures and realizations to make updates.
Authors
(none)
Tags
Stats
Related papers
- Regret Minimization And Convergence To Equilibria In General-sum Markov Games (2022)0.00
- On The Complexity Of Computing Sparse Equilibria And Lower Bounds For No-regret Learning In Games (2023)0.00
- Best Of Both Worlds: Regret Minimization Versus Minimax Play (2025)0.00
- Last-iterate Convergence Of Decentralized Optimistic Gradient Descent/ascent In Infinite-horizon Competitive Markov Games (2021)0.00
- Hierarchies Of No-regret Algorithms (2026)0.00
- Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-regret Learning In Markov Games (2022)0.00
- Learning In Zero-sum Markov Games: Relaxing Strong Reachability And Mixing Time Assumptions (2023)0.00
- Near Optimal Convergence To Coarse Correlated Equilibrium In General-sum Markov Games (2025)0.00