Soft-Bellman Equilibrium in Affine Markov Games: Forward Solutions and Inverse Learning
2023 · Shenghui Chen, Yue Yu, David Fridovich-Keil, et al.
Abstract
Markov games model interactions among multiple players in a stochastic, dynamic environment. Each player in a Markov game maximizes its expected total discounted reward, which depends upon the policies of the other players. We formulate a class of Markov games, termed affine Markov games, where an affine reward function couples the players' actions. We introduce a novel solution concept, the soft-Bellman equilibrium, where each player is boundedly rational and chooses a soft-Bellman policy rather than a purely rational policy as in the well-known Nash equilibrium concept. We provide conditions for the existence and uniqueness of the soft-Bellman equilibrium and propose a nonlinear least-squares algorithm to compute such an equilibrium in the forward problem. We then solve the inverse game problem of inferring the players' reward parameters from observed state-action trajectories via a projected-gradient algorithm. Experiments in a predator-prey OpenAI Gym environment show that the rewa
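As a rough illustration of what a soft-Bellman policy looks like, the sketch below runs soft value iteration on a tiny single-agent MDP. It is not the paper's multi-player nonlinear least-squares method. The function name `soft_bellman_policy`, the discount factor, and the two-state transition and reward tables are all hypothetical and chosen only for illustration. The hard max of the usual Bellman backup is replaced by a log-sum-exp, and the resulting policy is a softmax over Q-values, so suboptimal actions keep nonzero probability.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def soft_bellman_policy(P, R, gamma=0.9, iters=500):
    """Soft value iteration for a single-agent MDP (illustrative only).

    P[s][a] is a list of next-state probabilities, R[s][a] is a reward.
    Returns (policy, V) where policy[s][a] = exp(Q[s][a] - V[s]).
    """
    n_states = len(R)
    V = [0.0] * n_states
    Q = [[0.0] * len(R[s]) for s in range(n_states)]
    for _ in range(iters):
        # Soft-Bellman backup: Q from the current V, then V = logsumexp(Q).
        Q = [
            [
                R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
                for a in range(len(R[s]))
            ]
            for s in range(n_states)
        ]
        V = [logsumexp(q) for q in Q]
    # Softmax policy: probabilities proportional to exp(Q).
    policy = [[math.exp(q - V[s]) for q in Q[s]] for s in range(n_states)]
    return policy, V


# Hypothetical two-state, two-action example.
# Action 0 moves to (or stays in) state 0; action 1 moves to state 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
# Only staying in state 0 earns a reward.
R = [
    [1.0, 0.0],
    [0.0, 0.0],
]

policy, V = soft_bellman_policy(P, R)
for s, row in enumerate(policy):
    print(s, [round(p, 3) for p in row])
```

In this toy example the policy favors the rewarding action in each state but never assigns it probability one, which is the bounded-rationality behavior the abstract contrasts with a purely rational policy.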
Related papers
- Bounded Risk-Sensitive Markov Games: Forward Policy Design and Inverse Reward Learning with Iterative Reasoning and Cumulative Prospect Theory (2020)
- Can We Find Nash Equilibria at a Linear Rate in Markov Games? (2023)
- Learning Equilibria in Adversarial Team Markov Games: A Nonconvex-Hidden-Concave Min-Max Optimization Problem (2024)
- Policy Optimization for Markov Games: Unified Framework and Faster Convergence (2022)
- Efficiently Computing Nash Equilibria in Adversarial Team Markov Games (2022)
- Regret Minimization and Convergence to Equilibria in General-Sum Markov Games (2022)
- Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games (2023)
- Learning in Markov Games with Adaptive Adversaries: Policy Regret, Fundamental Barriers, and Efficient Algorithms (2024)