Online Learning In Stackelberg Games With An Omniscient Follower
2023 · Geng Zhao, Banghua Zhu, Jiantao Jiao, et al.
Abstract
We study the problem of online learning in a two-player decentralized cooperative Stackelberg game. In each round, the leader first takes an action, followed by the follower who takes their action after observing the leader's move. The goal of the leader is to learn to minimize the cumulative regret based on the history of interactions. Differing from the traditional formulation of repeated Stackelberg games, we assume the follower is omniscient, with full knowledge of the true reward, and that they always best-respond to the leader's actions. We analyze the sample complexity of regret minimization in this repeated Stackelberg game. We show that depending on the reward structure, the existence of the omniscient follower may change the sample complexity drastically, from constant to exponential, even for linear cooperative Stackelberg games. This poses unique challenges for the learning process of the leader and the subsequent regret analysis.
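The interaction protocol described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithm or analysis: it assumes finite action sets, a shared (cooperative) reward table `reward[a][b]`, and noiseless feedback. The names `best_response` and `play`, the try-each-action-then-commit leader strategy, and the numbers in the table are all hypothetical and chosen for illustration.

```python
def best_response(reward, a):
    """Omniscient follower: knows the true reward table and maximizes it
    given the leader's action a (ties go to the lowest index)."""
    row = reward[a]
    return max(range(len(row)), key=lambda b: row[b])


def play(reward, horizon):
    """Leader tries each action once, then commits to the best one seen.

    Returns the cumulative regret against the cooperative optimum
    max over (a, b) of reward[a][b]. Feedback is noiseless here, which is
    a simplifying assumption of this sketch.
    """
    n_leader = len(reward)
    optimal = max(max(row) for row in reward)
    observed = [float("-inf")] * n_leader  # reward seen for each leader action
    regret = 0.0
    for t in range(horizon):
        if t < n_leader:
            a = t  # exploration phase: one pull per leader action
        else:
            a = max(range(n_leader), key=lambda i: observed[i])  # commit
        b = best_response(reward, a)  # follower moves after seeing a
        r = reward[a][b]
        observed[a] = r
        regret += optimal - r
    return regret


# Hypothetical 3x3 cooperative reward table: rows are leader actions,
# columns are follower actions.
reward = [
    [0.2, 0.5, 0.1],
    [0.9, 0.3, 0.4],
    [0.6, 0.7, 0.0],
]
total_regret = play(reward, horizon=100)
```

In this toy table the leader pays regret only while trying each action once, because the follower's best response already reveals the best achievable reward for every leader action. The harder regimes mentioned in the abstract arise under other reward structures and are not captured by this sketch.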
Related papers
- Impact Of Decentralized Learning On Player Utilities In Stackelberg Games (2024)
- Online Learning In Unknown Markov Games (2020)
- Model-free Reinforcement Learning For Stochastic Stackelberg Security Games (2020)
- Can Reinforcement Learning Find Stackelberg-Nash Equilibria In General-sum Markov Games With Myopic Followers? (2021)
- Sample-efficient Learning Of Stackelberg Equilibria In General-sum Games (2021)
- Online Learning For Uninformed Markov Games: Empirical Nash-value Regret And Non-stationarity Adaptation (2026)
- Decentralized Model-free Reinforcement Learning In Stochastic Games With Average-reward Objective (2023)
- Actions Speak What You Want: Provably Sample-efficient Reinforcement Learning Of The Quantal Stackelberg Equilibrium From Strategic Feedbacks (2023)