Stackelberg Actor-critic: Game-theoretic Reinforcement Learning Algorithms
2021 · Liyuan Zheng, Tanner Fiez, Zane Alumbaugh, et al.
Abstract
The hierarchical interaction between the actor and critic in actor-critic based reinforcement learning algorithms naturally lends itself to a game-theoretic interpretation. We adopt this viewpoint and model the actor and critic interaction as a two-player general-sum game with a leader-follower structure known as a Stackelberg game. Given this abstraction, we propose a meta-framework for Stackelberg actor-critic algorithms where the leader player follows the total derivative of its objective instead of the usual individual gradient. From a theoretical standpoint, we develop a policy gradient theorem for the refined update and provide a local convergence guarantee for the Stackelberg actor-critic algorithms to a local Stackelberg equilibrium. From an empirical standpoint, we demonstrate via simple examples that the learning dynamics we study mitigate cycling and accelerate convergence compared to the usual gradient dynamics given cost structures induced by actor-critic formulations. Fin
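The difference between the leader's total derivative and its individual gradient can be illustrated on a toy problem. The sketch below is not the authors' algorithm or an actor-critic implementation: it assumes a hypothetical scalar game with leader cost f1(x, y) = 0.5(x - 1)^2 + 0.5y^2 and follower cost f2(x, y) = 0.5(y - x)^2, chosen only because the derivatives are easy to write by hand. The function names, step sizes, and starting point are likewise assumptions. The leader's total derivative uses the standard implicit-function form D f1 = d1 f1 - (d21 f2)(d22 f2)^{-1} d2 f1, which accounts for how the follower's best response shifts with x.

```python
# Toy scalar game (hypothetical, for illustration only):
#   leader cost   f1(x, y) = 0.5 * (x - 1)**2 + 0.5 * y**2
#   follower cost f2(x, y) = 0.5 * (y - x)**2
# The follower's best response is y*(x) = x.


def leader_individual_grad(x, y):
    """Partial derivative of f1 with respect to x, ignoring the follower's reaction."""
    return x - 1.0


def leader_total_grad(x, y):
    """Total derivative of f1 along the follower's implicit best response."""
    d1_f1 = x - 1.0   # partial f1 / partial x
    d2_f1 = y         # partial f1 / partial y
    d21_f2 = -1.0     # mixed second derivative of f2 (d/dx of partial f2 / partial y)
    d22_f2 = 1.0      # second derivative of f2 with respect to y
    # Implicit function theorem: dy*/dx = -d21_f2 / d22_f2
    return d1_f1 - (d21_f2 / d22_f2) * d2_f1


def follower_grad(x, y):
    """Partial derivative of f2 with respect to y."""
    return y - x


def simulate(use_total_derivative, steps=2000, lr_leader=0.05, lr_follower=0.1):
    """Run simultaneous gradient steps for both players from a fixed start."""
    x, y = 2.0, -1.0
    leader_grad = leader_total_grad if use_total_derivative else leader_individual_grad
    for _ in range(steps):
        gx = leader_grad(x, y)
        gy = follower_grad(x, y)
        x, y = x - lr_leader * gx, y - lr_follower * gy
    return x, y


stackelberg_point = simulate(use_total_derivative=True)
individual_point = simulate(use_total_derivative=False)
print("total-derivative leader:", stackelberg_point)
print("individual-gradient leader:", individual_point)
```

In this toy game the total-derivative dynamics settle near (0.5, 0.5), the minimizer of f1(x, y*(x)), while the individual-gradient dynamics settle near (1, 1). The example only shows that the two update rules can target different stationary points; it does not reproduce the cycling or convergence-speed behavior discussed in the abstract.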
Related papers
- Actor-dual-critic Dynamics For Zero-sum And Identical-interest Stochastic Games (2026) 0.00
- Actor-critic Algorithms For Constrained Multi-agent Reinforcement Learning (2019) 0.00
- Bi-level Actor-critic For Multi-agent Coordination (2019) 0.00
- Single-timescale Actor-critic Provably Finds Globally Optimal Policy (2020) 0.00
- An Approximate Policy Iteration Viewpoint Of Actor-critic Algorithms (2022) 2.26
- Analysis Of A Target-based Actor-critic Algorithm With Linear Function Approximation (2021) 0.00
- Convergence Of Decentralized Actor-critic Algorithm In General-sum Markov Games (2024) 3.58
- How To Learn A Useful Critic? Model-based Action-gradient-estimator Policy Optimization (2020) 0.00