Sample-efficient Learning Of Stackelberg Equilibria In General-sum Games
2021 · Yu Bai, Chi Jin, Huan Wang, et al.
Abstract
Real-world applications such as economics and policy making often involve solving multi-agent games with two unique features: (1) the agents are inherently asymmetric and partitioned into leaders and followers; (2) the agents have different reward functions, so the game is general-sum. Most existing results in this field focus on either symmetric solution concepts (e.g. Nash equilibrium) or zero-sum games. It remains open how to learn the Stackelberg equilibrium -- an asymmetric analog of the Nash equilibrium -- in general-sum games efficiently from noisy samples. This paper initiates the theoretical study of sample-efficient learning of the Stackelberg equilibrium in the bandit feedback setting, where we only observe noisy samples of the reward. We consider three representative two-player general-sum games: bandit games, bandit-reinforcement learning (bandit-RL) games, and linear bandit games. In all these games, we identify a fundamental gap between the exact value o
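To make the solution concept concrete, here is a minimal Python sketch of a pure-strategy Stackelberg solution in a two-player general-sum bandit game: the leader commits to an action, the follower best-responds under its own reward, and the leader picks the action with the best resulting payoff. This is a plug-in illustration, not the paper's algorithm. The function names, the tie-breaking rule (ties among follower best responses are broken in the leader's favor), the Gaussian noise model, and the example reward tables are all assumptions made here for illustration.

```python
import random


def stackelberg_from_means(leader_r, follower_r):
    """Return (a, b, leader_value) for a pure-strategy Stackelberg solution.

    leader_r[a][b] and follower_r[a][b] are mean rewards for leader action a
    and follower action b. Assumption: ties among the follower's best
    responses are broken in the leader's favor.
    """
    best = None
    for a, (l_row, f_row) in enumerate(zip(leader_r, follower_r)):
        top = max(f_row)
        # Follower's best responses to leader action a.
        candidates = [j for j, v in enumerate(f_row) if v == top]
        b = max(candidates, key=lambda j: l_row[j])
        if best is None or l_row[b] > best[2]:
            best = (a, b, l_row[b])
    return best


def estimate_means(true_means, n, noise, rng):
    """Average n noisy samples per entry (hypothetical Gaussian bandit feedback)."""
    return [
        [sum(m + rng.gauss(0.0, noise) for _ in range(n)) / n for m in row]
        for row in true_means
    ]


# Hypothetical 2x2 game: rows are leader actions, columns are follower actions.
LEADER_R = [[1.0, 0.0], [0.6, 0.4]]
FOLLOWER_R = [[0.0, 1.0], [1.0, 0.0]]

exact = stackelberg_from_means(LEADER_R, FOLLOWER_R)

rng = random.Random(0)
noisy = stackelberg_from_means(
    estimate_means(LEADER_R, 2000, 0.1, rng),
    estimate_means(FOLLOWER_R, 2000, 0.1, rng),
)
```

In this hypothetical game the leader's highest raw payoff sits at action 0, but the follower would answer it with action 1, so committing to action 1 is the better choice for the leader.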
Related papers
- Can Reinforcement Learning Find Stackelberg-nash Equilibria In General-sum Markov Games With Myopic Followers? (2021) 0.00
- Actions Speak What You Want: Provably Sample-efficient Reinforcement Learning Of The Quantal Stackelberg Equilibrium From Strategic Feedbacks (2023) 0.00
- Model-free Reinforcement Learning For Stochastic Stackelberg Security Games (2020) 5.24
- Oracles & Followers: Stackelberg Equilibria In Deep Multi-agent Reinforcement Learning (2022) 0.00
- A Black-box Approach For Non-stationary Multi-agent Reinforcement Learning (2023) 0.00
- Impact Of Decentralized Learning On Player Utilities In Stackelberg Games (2024) 0.00
- Minimax-optimal Multi-agent RL In Markov Games With A Generative Model (2022) 2.26
- Improving Sample Efficiency Of Model-free Algorithms For Zero-sum Markov Games (2023) 0.00