Decentralized Policy Gradient For Nash Equilibria Learning Of General-sum Stochastic Games
2022 · Yan Chen, Tao Li
Abstract
We study the learning of Nash equilibria in a general-sum stochastic game with an unknown transition probability density function. Agents take actions at the current environment state, and their joint action influences both the transition of the environment state and their immediate rewards. Each agent observes only the environment state and its own immediate reward, and has no knowledge of the actions or immediate rewards of the others. We introduce the concepts of weighted asymptotic Nash equilibrium with probability 1 and in probability. For the case with exact pseudo gradients, we design a two-loop algorithm based on the equivalence between Nash equilibrium and variational inequality problems. In the outer loop, we sequentially update a constructed strongly monotone variational inequality by updating a proximal parameter, while in the inner loop we employ a single-call extra-gradient algorithm to solve the constructed variational inequality. We show that if the associated Minty variational inequality has a solu
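The two-loop structure described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it assumes an exact, known operator `F` (the pseudo gradient of the bilinear zero-sum game min over x, max over y of x*y), a box constraint standing in for the policy set, and a Popov-style "past" extra-gradient step as the single-call inner method. The names `two_loop`, `mu`, `eta`, and the iteration counts are hypothetical choices made for illustration only.

```python
import numpy as np

def F(z):
    # Pseudo gradient of the toy game min_x max_y x*y: F(x, y) = (y, -x).
    # This operator is monotone but not strongly monotone.
    x, y = z
    return np.array([y, -x])

def project(z):
    # Euclidean projection onto the box [-1, 1]^2 (assumed feasible set).
    return np.clip(z, -1.0, 1.0)

def inner_single_call_eg(w, mu, eta, n_inner):
    # Solve the regularized VI with operator F_w(z) = F(z) + mu * (z - w),
    # which is strongly monotone with modulus mu.
    F_w = lambda z: F(z) + mu * (z - w)
    z = w.copy()
    g_prev = F_w(z)  # one extra operator call to initialize
    for _ in range(n_inner):
        z_half = project(z - eta * g_prev)  # extrapolate with the stored gradient
        g = F_w(z_half)                     # the only operator call per iteration
        z = project(z - eta * g)            # update with the fresh gradient
        g_prev = g
    return z

def two_loop(z0, mu=1.0, eta=0.1, n_outer=50, n_inner=200):
    # Outer loop: move the proximal center w to the inner solution.
    w = np.asarray(z0, dtype=float)
    for _ in range(n_outer):
        w = inner_single_call_eg(w, mu, eta, n_inner)
    return w

z_final = two_loop([0.8, -0.5])
print(np.linalg.norm(z_final))  # distance to the equilibrium (0, 0)
```

In this toy setting the unique equilibrium is the origin, so the printed norm measures how far the final proximal center is from it. The proximal term is what makes each inner problem strongly monotone, which is why a single-call method with a fixed step size can be used there even though the original operator is only monotone.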
Related papers
- On The Convergence Of Policy Gradient Methods To Nash Equilibria In General Stochastic Games (2022)
- Gradient Play In Stochastic Games: Stationary Points, Convergence, And Sample Complexity (2021)
- Asynchronous Gradient Play In Zero-sum Multi-agent Games (2022)
- Independent Policy Gradient For Large-scale Markov Potential Games: Sharper Rates, Function Approximation, And Game-agnostic Convergence (2022)
- Last-iterate Convergence Of Decentralized Optimistic Gradient Descent/ascent In Infinite-horizon Competitive Markov Games (2021)
- Convergence Analysis Of Gradient-based Learning With Non-uniform Learning Rates In Non-cooperative Multi-agent Settings (2019)
- Policy-gradient Algorithms Have No Guarantees Of Convergence In Linear Quadratic Games (2019)
- Regret Minimization And Convergence To Equilibria In General-sum Markov Games (2022)