V-learning -- A Simple, Efficient, Decentralized Algorithm For Multiagent RL
2021 · Chi Jin, Qinghua Liu, Yuanhao Wang, et al.
Abstract
A major challenge of multiagent reinforcement learning (MARL) is the curse of multiagents, where the size of the joint action space scales exponentially with the number of agents. This remains a bottleneck for designing efficient MARL algorithms even in a basic scenario with finitely many states and actions. This paper resolves this challenge for the model of episodic Markov games. We design a new class of fully decentralized algorithms -- V-learning, which provably learns Nash equilibria (in the two-player zero-sum setting), correlated equilibria and coarse correlated equilibria (in the multiplayer general-sum setting) in a number of samples that only scales with \(\max_{i\in[m]} A_i\), where \(A_i\) is the number of actions for the \(i^{\rm th}\) player. This is in sharp contrast to the size of the joint action space, which is \(\prod_{i=1}^m A_i\). V-learning (in its basic form) is a new class of single-agent RL algorithms that convert any adversarial bandit algorithm wit
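To make the contrast between \(\max_{i\in[m]} A_i\) and \(\prod_{i=1}^m A_i\) concrete, the following is a minimal sketch that computes both quantities for a given list of per-player action counts. It does not implement V-learning; the function names and the example of 10 players with 4 actions each are illustrative assumptions, not taken from the paper.

```python
from math import prod


def joint_action_space_size(action_counts):
    """Size of the joint action space: the product of all A_i."""
    return prod(action_counts)


def largest_individual_action_space(action_counts):
    """The quantity max_i A_i that the abstract's sample bound scales with."""
    return max(action_counts)


# Hypothetical example: m = 10 players, each with A_i = 4 actions.
action_counts = [4] * 10

print(joint_action_space_size(action_counts))          # 4**10 = 1048576
print(largest_individual_action_space(action_counts))  # 4
```

Adding one more player with 4 actions multiplies the first number by 4 and leaves the second unchanged, which is the exponential-versus-constant gap the abstract describes.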
Related papers
- On Improving Model-free Algorithms For Decentralized Multi-agent Reinforcement Learning (2021)
- Breaking The Curse Of Multiagency: Provably Efficient Decentralized Multi-agent RL With Function Approximation (2023)
- Incentivize Without Bonus: Provably Efficient Model-based Online Multi-agent RL For Markov Games (2025)
- Decentralized Q-learning In Zero-sum Markov Games (2021)
- Minimax-optimal Multi-agent RL In Markov Games With A Generative Model (2022)
- Provably Efficient Reinforcement Learning In Decentralized General-sum Markov Games (2021)
- MA2QL: A Minimalist Approach To Fully Decentralized Multi-agent Reinforcement Learning (2022)
- Mean-field Multi-agent Reinforcement Learning: A Decentralized Network Approach (2021)