Provably Efficient Information-directed Sampling Algorithms For Multi-agent Reinforcement Learning
2024 · Qiaosheng Zhang, Chenjia Bai, Shuyue Hu, et al.
Abstract
This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS). These algorithms draw inspiration from foundational concepts in information theory, and are proven to be sample efficient in MARL settings such as two-player zero-sum Markov games (MGs) and multi-player general-sum MGs. For episodic two-player zero-sum MGs, we present three sample-efficient algorithms for learning Nash equilibrium. The basic algorithm, referred to as MAIDS, employs an asymmetric learning structure where the max-player first solves a minimax optimization problem based on the joint information ratio of the joint policy, and the min-player then minimizes the marginal information ratio with the max-player's policy fixed. Theoretical analyses show that it achieves a Bayesian regret of $\tilde{O}(\sqrt{K})$ for $K$ episodes. To reduce the computational load of MAIDS, we develop an improved algorithm called Reg-MAI
Related papers
- Incentivize Without Bonus: Provably Efficient Model-based Online Multi-agent RL For Markov Games (2025)
- On Improving Model-free Algorithms For Decentralized Multi-agent Reinforcement Learning (2021)
- Minimax-optimal Multi-agent RL In Markov Games With A Generative Model (2022)
- On The Complexity Of Multi-agent Decision Making: From Learning In Games To Partial Monitoring (2023)
- Sample-efficient Reinforcement Learning Of Partially Observable Markov Games (2022)
- Maximum Entropy Heterogeneous-agent Reinforcement Learning (2023)
- Breaking The Curse Of Multiagency In Robust Multi-agent Reinforcement Learning (2024)
- Efficient Model-based Multi-agent Reinforcement Learning Via Optimistic Equilibrium Computation (2022)