Stackelberg POMDP: A Reinforcement Learning Approach For Economic Design
2022 · Gianluca Brero, Alon Eden, Darshan Chakrabarti, et al.
Abstract
We introduce a reinforcement learning framework for economic design where the interaction between the environment designer and the participants is modeled as a Stackelberg game. In this game, the designer (leader) sets up the rules of the economic system, while the participants (followers) respond strategically. We integrate algorithms for determining followers' response strategies into the leader's learning environment, providing a formulation of the leader's learning problem as a POMDP that we call the Stackelberg POMDP. We prove that the optimal leader's strategy in the Stackelberg game is the optimal policy in our Stackelberg POMDP under a limited set of possible policies, establishing a connection between solving POMDPs and Stackelberg games. We solve our POMDP under a limited set of policy options via the centralized training with decentralized execution framework. For the specific case of followers that are modeled as no-regret learners, we solve an array of increasingly complex
Related papers
- Model-free Reinforcement Learning For Stochastic Stackelberg Security Games (2020)
- Oracles & Followers: Stackelberg Equilibria In Deep Multi-agent Reinforcement Learning (2022)
- Decentralized Reinforcement Learning: Global Decision-making Via Local Economic Transactions (2020)
- Online Learning In Stackelberg Games With An Omniscient Follower (2023)
- Can Reinforcement Learning Find Stackelberg-Nash Equilibria In General-sum Markov Games With Myopic Followers? (2021)
- Stackelberg Batch Policy Learning (2023)
- Stackelberg Games For Learning Emergent Behaviors During Competitive Autocurricula (2023)
- Learning To Design Games: Strategic Environments In Reinforcement Learning (2017)