Incentivize Without Bonus: Provably Efficient Model-based Online Multi-agent RL For Markov Games
2025 Β· Tong Yang, Bo Dai, Lin Xiao, et al.
Abstract
Multi-agent reinforcement learning (MARL) lies at the heart of a plethora of applications involving the interaction of a group of agents in a shared unknown environment. A prominent framework for studying MARL is Markov games, with the goal of finding various notions of equilibria in a sample-efficient manner, such as the Nash equilibrium (NE) and the coarse correlated equilibrium (CCE). However, existing sample-efficient approaches either require tailored uncertainty estimation under function approximation, or careful coordination of the players. In this paper, we propose a novel model-based algorithm, called VMG, that incentivizes exploration via biasing the empirical estimate of the model parameters towards those with a higher collective best-response values of all the players when fixing the other players' policies, thus encouraging the policy to deviate from its current equilibrium for more exploration. VMG is oblivious to different forms of function approximation, and permits sim
Authors
(none)
Tags
Stats
Related papers
- Minimax-optimal Multi-agent RL In Markov Games With A Generative Model (2022)2.26
- Breaking The Curse Of Multiagency In Robust Multi-agent Reinforcement Learning (2024)0.00
- Efficient Model-based Multi-agent Reinforcement Learning Via Optimistic Equilibrium Computation (2022)0.00
- On Improving Model-free Algorithms For Decentralized Multi-agent Reinforcement Learning (2021)0.00
- V-learning -- A Simple, Efficient, Decentralized Algorithm For Multiagent RL (2021)0.00
- Risk-sensitive Multi-agent Reinforcement Learning In Network Aggregative Markov Games (2024)0.00
- Breaking The Curse Of Multiagency: Provably Efficient Decentralized Multi-agent RL With Function Approximation (2023)0.00
- Faster Last-iterate Convergence Of Policy Optimization In Zero-sum Markov Games (2022)0.00