Abstract

We present a new family of logit-Q dynamics for efficient learning in stochastic games by combining the log-linear learning (also known as logit dynamics) for the repeated play of normal-form games with Q-learning for unknown Markov decision processes within the auxiliary stage-game framework. In this framework, we view stochastic games as agents repeatedly playing some stage game associated with the current state of the underlying game while the agents' Q-functions determine the payoffs of these stage games. We show that the logit-Q dynamics presented reach (near) efficient equilibrium in stochastic teams with unknown dynamics and quantify the approximation error. We also show the rationality of the logit-Q dynamics against agents following pure stationary strategies and the convergence of the dynamics in stochastic games where the stage-payoffs induce potential games, yet only a single agent controls the state transitions beyond stochastic teams. The key idea is to approximate the dy

Authors

(none)

Tags

  • Uncategorized

Stats

  • citations0
  • S2 citationsβ€”
  • github stars0
  • HF likes0
  • heat score0.00
  • arxiv keydonmez2023logit

Related papers