Global Convergence Of Policy Gradient Methods In Reinforcement Learning, Games And Control
2023 · Shicong Cen, Yuejie Chi
Abstract
Policy gradient methods, where one searches for the policy of interest by maximizing the value functions using first-order information, have become increasingly popular for sequential decision making in reinforcement learning, games, and control. Guaranteeing the global optimality of policy gradient methods, however, is highly nontrivial due to the nonconcavity of the value functions. In this exposition, we highlight recent progress in understanding and developing policy gradient methods with global convergence guarantees, with an emphasis on their finite-time convergence rates with respect to salient problem parameters.
Related papers
- Global Convergence Of Policy Gradient For Linear-quadratic Mean-field Control/game In Continuous Time (2020)
- On The Theory Of Policy Gradient Methods: Optimality, Approximation, And Distribution Shift (2019)
- Linear Convergence Of A Policy Gradient Method For Some Finite Horizon Continuous Time Control Problems (2022)
- Global Convergence Using Policy Gradient Methods For Model-free Markovian Jump Linear Quadratic Control (2021)
- Convergence And Optimality Of Policy Gradient Methods In Weakly Smooth Settings (2021)
- On The Convergence Of Policy Gradient Methods To Nash Equilibria In General Stochastic Games (2022)
- On The Linear Convergence Of Natural Policy Gradient Algorithm (2021)
- On The Convergence Of Discounted Policy Gradient Methods (2022)