UPDeT: Universal Multi-agent Reinforcement Learning Via Policy Decoupling With Transformers
2021 · Siyi Hu, Fengda Zhu, Xiaojun Chang, et al.
Abstract
Recent advances in multi-agent reinforcement learning have been largely limited to training one model from scratch for every new task. The limitation is due to the restricted model architecture related to fixed input and output dimensions. This hinders the experience accumulation and transfer of the learned agent over tasks with diverse levels of difficulty (e.g. 3 vs 3 or 5 vs 6 multi-agent games). In this paper, we make the first attempt to explore a universal multi-agent reinforcement learning pipeline, designing one single architecture to fit tasks that require different observation and action configurations. Unlike previous RNN-based models, we utilize a transformer-based model to generate a flexible policy by decoupling the policy distribution from the intertwined input observation with an importance weight measured by the merits of the self-attention mechanism. Compared to a standard transformer block, the proposed model, named Universal Policy Decoupling Transform
Related papers
- Decentralized Transformers With Centralized Aggregation Are Sample-efficient Multi-agent World Models (2024)
- Model Based Multi-agent Reinforcement Learning With Tensor Decompositions (2021)
- Multi-agent Transformer-accelerated RL For Satisfaction Of STL Specifications (2024)
- Making Universal Policies Universal (2025)
- Deep Decentralized Multi-task Multi-agent Reinforcement Learning Under Partial Observability (2017)
- Scalable Centralized Deep Multi-agent Reinforcement Learning Via Policy Gradients (2018)
- Deep Multiagent Reinforcement Learning: Challenges And Directions (2021)
- Transformer Based Reinforcement Learning For Games (2019)