Tesseract: Tensorised Actors For Multi-agent Reinforcement Learning
2021 · Anuj Mahajan, Mikayel Samvelyan, Lei Mao, et al.
Abstract
Reinforcement learning in large action spaces is a challenging problem. Cooperative multi-agent reinforcement learning (MARL) exacerbates matters by imposing various constraints on communication and observability. In this work, we consider the fundamental hurdle affecting both value-based and policy-gradient approaches: an exponential blowup of the action space with the number of agents. For value-based methods, it poses challenges in accurately representing the optimal value function. For policy-gradient methods, it makes training the critic difficult and exacerbates the problem of the lagging critic. We show that, from a learning-theory perspective, both problems can be addressed by accurately representing the associated action-value function with a low-complexity hypothesis class. This requires accurately modelling the agent interactions in a sample-efficient way. To this end, we propose a novel tensorised formulation of the Bellman equation. This gives rise to our method Tesseract,
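The exponential blowup mentioned in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it counts the entries of a joint action-value tensor for a single state, and compares that with the parameter count of a generic low-rank (sum of rank-one terms) factorisation, which is assumed here as one example of a low-complexity representation. The function names, the `rank` parameter, and the choice of factorisation are all hypothetical.

```python
import numpy as np


def joint_action_count(n_agents: int, n_actions: int) -> int:
    """Number of joint actions when each agent picks one of n_actions."""
    return n_actions ** n_agents


def low_rank_param_count(n_agents: int, n_actions: int, rank: int) -> int:
    """Parameters of an assumed rank-`rank` factorisation: one
    (n_actions x rank) factor matrix per agent, plus one weight per term."""
    return rank * (n_agents * n_actions + 1)


def reconstruct(weights: np.ndarray, factors: list) -> np.ndarray:
    """Rebuild the full joint tensor as a weighted sum of rank-one terms.

    weights: shape (rank,)
    factors: one array of shape (n_actions, rank) per agent
    """
    shape = tuple(f.shape[0] for f in factors)
    q = np.zeros(shape)
    for r in range(weights.shape[0]):
        term = weights[r]
        for f in factors:
            # Outer product adds one axis (one agent) at a time.
            term = np.multiply.outer(term, f[:, r])
        q += term
    return q


if __name__ == "__main__":
    # 10 agents with 5 actions each: full table vs. assumed rank-3 factorisation.
    print(joint_action_count(10, 5))       # 9765625
    print(low_rank_param_count(10, 5, 3))  # 153
```

Under these assumptions the full table grows exponentially in the number of agents, while the factorised form grows linearly. Whether a low rank is enough to capture the agent interactions depends on the task.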
Related papers
- Model Based Multi-agent Reinforcement Learning With Tensor Decompositions (2021)
- Multi-agent Reinforcement Learning In Stochastic Networked Systems (2020)
- Transformer-based Scalable Multi-agent Reinforcement Learning For Networked Systems With Long-range Interactions (2025)
- Scalable Multi-agent Reinforcement Learning For Networked Systems With Average Reward (2020)
- An Initial Introduction To Cooperative Multi-agent Reinforcement Learning (2024)
- Policy Distillation And Value Matching In Multiagent Reinforcement Learning (2019)
- Bridging MARL To SARL: An Order-independent Multi-agent Transformer Via Latent Consensus (2026)
- Modeling The Interaction Between Agents In Cooperative Multi-agent Reinforcement Learning (2021)