Fast Teammate Adaptation In The Presence Of Sudden Policy Change
2023 · Ziqian Zhang, Lei Yuan, Lihe Li, et al.
Abstract
In cooperative multi-agent reinforcement learning (MARL), where an agent coordinates with teammate(s) toward a shared goal, the agent may suffer from non-stationarity caused by changes in its teammates' policies. Prior works mainly concentrate on policy changes during the training phase or on teammates that change across episodes, ignoring the fact that teammates' policies may change suddenly within an episode, which can lead to miscoordination and poor performance. We formulate the problem as an open Dec-POMDP, where we control some agents to coordinate with uncontrolled teammates whose policies may change within a single episode. We then develop a new framework, fast teammates adaptation (Fastap), to address the problem. Concretely, we first train versatile teammates' policies and assign them to different clusters via the Chinese Restaurant Process (CRP). Then, we train the controlled agent(s) to coordinate with the sampled uncontrolled teammates by capturing their identifications a
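As a rough illustration of the clustering step named in the abstract, the following is a minimal sketch of sampling cluster assignments from a plain Chinese Restaurant Process prior. It is not Fastap's actual procedure: the abstract does not say how teammate policies are compared or scored, so this sketch uses only the CRP's count-based rule. The function name `crp_assign` and the concentration parameter `alpha` are assumptions introduced for illustration.

```python
import random
from typing import List, Optional


def crp_assign(n_items: int, alpha: float,
               rng: Optional[random.Random] = None) -> List[int]:
    """Sample cluster labels for n_items from a CRP prior (hypothetical helper).

    Item i joins existing cluster k with probability proportional to the
    number of items already in k, or opens a new cluster with probability
    proportional to alpha.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = rng or random.Random()
    counts: List[int] = []       # counts[k] = items currently in cluster k
    assignments: List[int] = []  # assignments[i] = cluster label of item i
    for _ in range(n_items):
        # Last weight corresponds to opening a new cluster.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)     # new cluster
        else:
            counts[k] += 1       # join existing cluster
        assignments.append(k)
    return assignments


if __name__ == "__main__":
    # Example: assign 10 hypothetical teammate policies to clusters.
    print(crp_assign(10, alpha=1.0, rng=random.Random(0)))
```

A larger `alpha` tends to produce more clusters. A method that clusters policies by behavior would presumably combine this prior with some similarity or likelihood term, which is left out here.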
Related papers
- Learning To Coordinate With Anyone (2023)
- Modelling The Dynamic Joint Policy Of Teammates With Attention Multi-agent DDPG (2018)
- Adaptive Opponent Policy Detection In Multi-agent MDPs: Real-time Strategy Switch Identification Using Running Error Estimation (2024)
- HyperMARL: Adaptive Hypernetworks For Multi-agent RL (2024)
- Fast Peer Adaptation With Context-aware Exploration (2024)
- Transferable Multi-agent Reinforcement Learning With Dynamic Participating Agents (2022)
- Padiff: Predictive And Adaptive Diffusion Policies For Ad Hoc Teamwork (2025)
- Dealing With Non-stationarity In Decentralized Cooperative Multi-agent Deep Reinforcement Learning Via Multi-timescale Learning (2023)