Transformer-based Scalable Multi-agent Reinforcement Learning For Networked Systems With Long-range Interactions
2025 · Vidur Sinha, Muhammed Ustaomeroglu, Guannan Qu
Abstract
Multi-agent reinforcement learning (MARL) has shown promise for large-scale network control, yet existing methods face two major limitations. First, they typically rely on assumptions leading to decay properties of local agent interactions, limiting their ability to capture long-range dependencies such as cascading power failures or epidemic outbreaks. Second, most approaches lack generalizability across network topologies, requiring retraining when applied to new graphs. We introduce STACCA (Shared Transformer Actor-Critic with Counterfactual Advantage), a unified transformer-based MARL framework that addresses both challenges. STACCA employs a centralized Graph Transformer Critic to model long-range dependencies and provide system-level feedback, while its shared Graph Transformer Actor learns a generalizable policy capable of adapting across diverse network structures. Further, to improve credit assignment during training, STACCA integrates a novel counterfactual advantage estimator
Related papers
- Multi-agent Reinforcement Learning In Stochastic Networked Systems (2020)
- Scalable Multi-agent Reinforcement Learning For Networked Systems With Average Reward (2020)
- Fully Decentralized Multi-agent Reinforcement Learning With Networked Agents (2018)
- Bridging MARL To SARL: An Order-independent Multi-agent Transformer Via Latent Consensus (2026)
- Decentralized Multi-agent Reinforcement Learning With Networked Agents: Recent Advances (2019)
- Transformer-based Value Function Decomposition For Cooperative Multi-agent Reinforcement Learning In Starcraft (2022)
- Bridging Training And Execution Via Dynamic Directed Graph-based Communication In Cooperative Multi-agent Systems (2024)
- AC2C: Adaptively Controlled Two-hop Communication For Multi-agent Reinforcement Learning (2023)