Minimax-optimal Multi-agent Robust Reinforcement Learning

Abstract

Multi-agent robust reinforcement learning, also known as multi-player robust Markov games (RMGs), is a crucial framework for modeling competitive interactions under environmental uncertainties, with wide applications in multi-agent systems. However, existing sample complexity results for RMGs suffer from at least one of three obstacles: a restrictive range of uncertainty levels or accuracy, the curse of multiple agents, and the barrier of long horizons. Each of these causes existing results to significantly exceed the information-theoretic lower bound. To close this gap, we extend the Q-FTRL algorithm \citep{li2022minimax} to RMGs in the finite-horizon setting, assuming access to a generative model. We prove that the proposed algorithm achieves an \(\epsilon\)-robust coarse correlated equilibrium (CCE) with a sample complexity (up to log factors) of \(\widetilde{O}\left(H^3 S\sum_{i=1}^m A_i\min\left\{H,1/R\right\}/\epsilon^2\right)\), where \(S\) denotes the number of states, \(A_i
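To make the scaling in the stated bound concrete, the following is a minimal sketch that evaluates only its leading term, \(H^3 S\sum_{i=1}^m A_i\min\{H,1/R\}/\epsilon^2\). The function name `leading_sample_term` and its parameter names are hypothetical and do not come from the paper; \(R\) is treated simply as a positive parameter, since its definition is cut off in the text above. Constants and log factors are dropped, so the returned values are useful for comparing regimes, not as actual sample counts.

```python
from typing import Sequence


def leading_sample_term(
    H: int,
    S: int,
    action_sizes: Sequence[int],
    R: float,
    eps: float,
) -> float:
    """Evaluate H^3 * S * sum_i A_i * min(H, 1/R) / eps^2.

    Illustrative only: constants and logarithmic factors hidden by the
    O-tilde notation are ignored, and R is assumed to be strictly positive.
    """
    if H <= 0 or S <= 0 or R <= 0 or eps <= 0:
        raise ValueError("H, S, R and eps must all be positive")
    if not action_sizes or any(a <= 0 for a in action_sizes):
        raise ValueError("action_sizes must be a non-empty list of positive ints")

    # The agents enter through the SUM of their action-set sizes, not the product.
    total_actions = sum(action_sizes)

    # min(H, 1/R) switches between two regimes at R = 1/H.
    horizon_factor = min(H, 1.0 / R)

    return H**3 * S * total_actions * horizon_factor / eps**2
```

For example, with `H=10`, `S=5`, two agents with 2 and 3 actions, and `eps=0.5`, the term is governed by `1/R` when `R=0.5` (since `1/R = 2 < H`) and by `H` when `R=0.01` (since `1/R = 100 > H`).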
