KD-MARL: Resource-aware Knowledge Distillation In Multi-agent Reinforcement Learning
2026 · Monirul Islam Pavel, Siyi Hu, Muhammad Anwar Masum, et al.
Abstract
Real-world deployment of multi-agent reinforcement learning (MARL) systems is fundamentally constrained by limited compute, memory, and inference time. While expert policies achieve high performance, they rely on costly decision cycles and large-scale models that are impractical for edge devices or embedded platforms. Knowledge distillation (KD) offers a promising path toward resource-aware execution, but existing KD methods in MARL focus narrowly on action imitation, often neglecting coordination structure and assuming uniform agent capabilities. We propose resource-aware Knowledge Distillation for Multi-Agent Reinforcement Learning (KD-MARL), a two-stage framework that transfers coordinated behavior from a centralized expert to lightweight, decentralized student agents. The student policies are trained without a critic, relying instead on distilled advantage signals and structured policy supervision to preserve coordination under heterogeneous and limited observations. Our approach transfers both
Related papers
- KnowRU: Knowledge Reusing Via Knowledge Distillation In Multi-agent Reinforcement Learning (2021)
- Policy Distillation And Value Matching In Multiagent Reinforcement Learning (2019)
- KnowSR: Knowledge Sharing Among Homogeneous Agents In Multi-agent Reinforcement Learning (2021)
- Contextual Knowledge Sharing In Multi-agent Reinforcement Learning With Decentralized Communication And Coordination (2025)
- Cautiously-optimistic Knowledge Sharing For Cooperative Multi-agent Reinforcement Learning (2023)
- Multi-agent Reinforcement Learning Via Adaptive Kalman Temporal Difference And Successor Representation (2021)
- HyperMARL: Adaptive Hypernetworks For Multi-agent RL (2024)
- Towards Global Optimality In Cooperative MARL With The Transformation And Distillation Framework (2022)