CPIG: Leveraging Consistency Policy With Intention Guidance For Multi-agent Exploration
2024 · Yuqian Fu, Yuanheng Zhu, Haoran Li, et al.
Abstract
Efficient exploration is crucial in cooperative multi-agent reinforcement learning (MARL), especially in sparse-reward settings. However, because existing methods rely on a unimodal policy, they are prone to falling into local optima, which hinders effective exploration of better policies. Furthermore, in sparse-reward settings, each agent receives only scarce rewards, which poses significant challenges to inter-agent cooperation. This not only increases the difficulty of policy learning but also degrades the overall performance of multi-agent tasks. To address these issues, we propose a Consistency Policy with Intention Guidance (CPIG), with two primary components: (a) introducing a multimodal policy to enhance the agent's exploration capability, and (b) sharing the intention among agents to foster agent cooperation. For component (a), CPIG incorporates a Consistency model as the policy, leveraging its multimodal nature and stochastic characteristics to facilitate explorat
Related papers
- Prioritized Guidance For Efficient Multi-agent Reinforcement Learning Exploration (2019) · 0.00
- Coordinated Exploration Via Intrinsic Rewards For Multi-agent Reinforcement Learning (2019) · 0.00
- Reaching Consensus In Cooperative Multi-agent Reinforcement Learning With Goal Imagination (2024) · 0.00
- Individual Contributions As Intrinsic Exploration Scaffolds For Multi-agent Reinforcement Learning (2024) · 2.80
- Co2po: Coordinated Constrained Policy Optimization For Multi-agent RL (2026) · 0.00
- LIGS: Learnable Intrinsic-reward Generation Selection For Multi-agent Learning (2021) · 0.00
- Global Convergence Of Localized Policy Iteration In Networked Multi-agent Reinforcement Learning (2022) · 2.26
- Revisiting Some Common Practices In Cooperative Multi-agent Reinforcement Learning (2022) · 0.00