Abstract

We investigate learning the equilibria in non-stationary multi-agent systems and address the challenges that differentiate multi-agent learning from single-agent learning. Specifically, we focus on games with bandit feedback, where testing an equilibrium can result in substantial regret even when the gap to be tested is small, and the existence of multiple optimal solutions (equilibria) in stationary games poses extra challenges. To overcome these obstacles, we propose a versatile black-box approach applicable to a broad spectrum of problems, such as general-sum games, potential games, and Markov games, when equipped with appropriate learning and testing oracles for stationary environments. Our algorithms can achieve \(\widetilde{O}\left(\Delta^{1/4}T^{3/4}\right)\) regret when the degree of nonstationarity, as measured by total variation \(\Delta\), is known, and \(\widetilde{O}\left(\Delta^{1/5}T^{4/5}\right)\) regret when \(\Delta\) is unknown, where \(T\) is the number

Authors

(none)

Tags

  • Multi-Agent

Stats

  • citations: 0
  • S2 citations: n/a
  • GitHub stars: 0
  • HF likes: 0
  • heat score: 0.00
  • arxiv key: jiang2023a

Related papers