Policy-gradient Algorithms Have No Guarantees Of Convergence In Linear Quadratic Games
2019 · Eric Mazumdar, Lillian J. Ratliff, Michael I. Jordan, et al.
Abstract
We show by counterexample that policy-gradient algorithms have no guarantees of even local convergence to Nash equilibria in continuous action and state space multi-agent settings. To do so, we analyze gradient-play in N-player general-sum linear quadratic games, a classic game setting which is recently emerging as a benchmark in the field of multi-agent learning. In such games the state and action spaces are continuous and global Nash equilibria can be found by solving coupled Riccati equations. Further, gradient-play in LQ games is equivalent to multi-agent policy-gradient. We first show that these games are surprisingly not convex games. Despite this, we are still able to show that the only critical points of the gradient dynamics are global Nash equilibria. We then give sufficient conditions under which policy-gradient will avoid the Nash equilibria, and generate a large number of general-sum linear quadratic games that satisfy these conditions. In such games we empirically observe
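The gradient-play update that the abstract analyzes can be illustrated with a minimal sketch. It assumes a standard discrete-time formulation that the abstract does not spell out: dynamics x_{t+1} = A x_t + sum_i B_i u_i, linear feedback policies u_i = -K_i x, and per-player cost J_i = E[sum_t x^T Q_i x + u_i^T R_i u_i]. The function names (`dlyap`, `cost_and_grad`, `gradient_play_step`) and the two-player matrices are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
import numpy as np

def dlyap(a, q):
    """Solve X = a X a^T + q by vectorization (adequate for small systems)."""
    n = a.shape[0]
    x = np.linalg.solve(np.eye(n * n) - np.kron(a, a), q.reshape(-1))
    return x.reshape(n, n)

def cost_and_grad(i, A, Bs, Qs, Rs, Ks, Sigma0):
    """Cost of player i and its gradient with respect to player i's own gain K_i."""
    A_cl = A - sum(B @ K for B, K in zip(Bs, Ks))  # joint closed-loop dynamics
    if np.max(np.abs(np.linalg.eigvals(A_cl))) >= 1.0:
        raise ValueError("closed loop is unstable; cost is infinite")
    # Value matrix: P = Q_i + K_i^T R_i K_i + A_cl^T P A_cl
    P = dlyap(A_cl.T, Qs[i] + Ks[i].T @ Rs[i] @ Ks[i])
    # State correlation: Sigma = Sigma0 + A_cl Sigma A_cl^T
    Sigma = dlyap(A_cl, Sigma0)
    cost = float(np.trace(P @ Sigma0))
    grad = 2.0 * (Rs[i] @ Ks[i] - Bs[i].T @ P @ A_cl) @ Sigma
    return cost, grad

def gradient_play_step(A, Bs, Qs, Rs, Ks, Sigma0, lrs):
    """One simultaneous update: each player descends its own cost only."""
    grads = [cost_and_grad(i, A, Bs, Qs, Rs, Ks, Sigma0)[1] for i in range(len(Ks))]
    return [K - lr * g for K, lr, g in zip(Ks, lrs, grads)]

# Hypothetical two-player instance (illustrative values, not from the paper).
A = np.array([[0.5, 0.1], [0.0, 0.4]])
Bs = [np.array([[1.0], [0.0]]), np.array([[0.0], [1.0]])]
Qs = [np.eye(2), np.eye(2)]
Rs = [np.array([[1.0]]), np.array([[1.0]])]
Ks = [np.array([[0.1, 0.0]]), np.array([[0.0, 0.1]])]
Sigma0 = np.eye(2)  # assumed covariance of the initial state

new_Ks = gradient_play_step(A, Bs, Qs, Rs, Ks, Sigma0, lrs=[0.01, 0.01])
```

Each player differentiates only its own cost with respect to its own gain while the other gains are held fixed, which is what distinguishes gradient-play from descent on a single shared objective. Iterating `gradient_play_step` on a well-behaved instance like this one says nothing about the non-convergent games the abstract refers to; it only shows the shape of the update.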
Related papers
- On The Convergence Of Policy Gradient Methods To Nash Equilibria In General Stochastic Games (2022)
- Reinforcement Learning In Nonzero-sum Linear Quadratic Deep Structured Games: Global Convergence Of Policy Optimization (2020)
- Independent Policy Gradient For Large-scale Markov Potential Games: Sharper Rates, Function Approximation, And Game-agnostic Convergence (2022)
- Convergence Analysis Of Gradient-based Learning With Non-uniform Learning Rates In Non-cooperative Multi-agent Settings (2019)
- Global Convergence Of Policy Gradient For Linear-quadratic Mean-field Control/game In Continuous Time (2020)
- Global Convergence Of Policy Gradient Methods In Reinforcement Learning, Games And Control (2023)
- Symmetric (optimistic) Natural Policy Gradient For Multi-agent Learning With Parameter Convergence (2022)
- Learning Distributed Equilibria In Linear-quadratic Stochastic Differential Games: An \(\alpha\)-potential Approach (2026)