Linear Convergence Of Entropy-regularized Natural Policy Gradient With Linear Function Approximation
2021 · Semih Cayci, Niao He, R. Srikant
Abstract
Natural policy gradient (NPG) methods with entropy regularization achieve impressive empirical success in reinforcement learning problems with large state-action spaces. However, their convergence properties and the impact of entropy regularization remain elusive in the function approximation regime. In this paper, we establish finite-time convergence analyses of entropy-regularized NPG with linear function approximation under softmax parameterization. In particular, we prove that entropy-regularized NPG with averaging satisfies the *persistence of excitation* condition, and achieves a fast convergence rate of \(\tilde{O}(1/T)\) up to a function approximation error in regularized Markov decision processes. This convergence result does not require any a priori assumptions on the policies. Furthermore, under mild regularity conditions on the concentrability coefficient and basis vectors, we prove that entropy-regularized NPG exhibits *linear convergence* up to a function approximation
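To make the setting in the abstract concrete, the following is a minimal sketch of an entropy-regularized NPG iteration with a log-linear (softmax) policy. It is not the authors' algorithm: it assumes a small MDP with a known model, evaluates the soft Q-function exactly instead of from samples, fits it by unweighted least squares, and omits the averaging step mentioned above. All function names and parameters (`softmax_policy`, `soft_q`, `npg_step`, `run_npg`, `eta`, `tau`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def softmax_policy(theta, Phi):
    """pi(a|s) proportional to exp(theta . phi(s, a)); Phi has shape (S, A, d)."""
    logits = Phi @ theta                                   # shape (S, A)
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def soft_q(pi, P, r, gamma, tau, iters=500):
    """Entropy-regularized Q of a fixed policy via fixed-point iteration.

    Assumption: the transition tensor P (S, A, S) and reward r (S, A) are known.
    """
    Q = np.zeros_like(r)
    log_pi = np.log(np.clip(pi, 1e-300, None))
    for _ in range(iters):
        V = (pi * (Q - tau * log_pi)).sum(axis=1)          # soft state value
        Q = r + gamma * (P @ V)
    return Q

def npg_step(theta, Phi, P, r, gamma, tau, eta):
    """One regularized NPG-style update in parameter space.

    The soft Q-function is projected onto the features by least squares,
    and the parameters are shrunk by (1 - eta * tau) before the step.
    """
    pi = softmax_policy(theta, Phi)
    Q = soft_q(pi, P, r, gamma, tau)
    S, A, d = Phi.shape
    w, *_ = np.linalg.lstsq(Phi.reshape(S * A, d), Q.reshape(S * A), rcond=None)
    return (1.0 - eta * tau) * theta + eta * w

def run_npg(Phi, P, r, gamma, tau, eta, num_iters):
    """Run num_iters updates starting from theta = 0 (the uniform policy)."""
    theta = np.zeros(Phi.shape[2])
    for _ in range(num_iters):
        theta = npg_step(theta, Phi, P, r, gamma, tau, eta)
    return theta
```

With one-hot features (one feature per state-action pair) the least-squares fit is exact and the update reduces to the tabular entropy-regularized NPG update, so the iterates approach the soft-optimal policy. With fewer features than state-action pairs the fit is inexact, which is where the function approximation error referred to in the abstract enters. The sketch assumes `eta * tau <= 1`.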
Related papers
- Fast Global Convergence Of Natural Policy Gradient Methods With Entropy Regularization (2020)
- Linear Convergence Of Independent Natural Policy Gradient In Games With Entropy Regularization (2024)
- Rethinking The Global Convergence Of Softmax Policy Gradient With Linear Function Approximation (2025)
- Matryoshka Policy Gradient For Entropy-regularized RL: Convergence And Global Optimality (2023)
- Symmetric (optimistic) Natural Policy Gradient For Multi-agent Learning With Parameter Convergence (2022)
- Convergence Of Policy Gradient For Entropy Regularized MDPs With Neural Network Approximation In The Mean-field Regime (2022)
- Beyond Exact Gradients: Convergence Of Stochastic Soft-max Policy Gradient Methods With Entropy Regularization (2021)
- Provably Fast Convergence Of Independent Natural Policy Gradient For Markov Potential Games (2023)