Never Explore Repeatedly In Multi-agent Reinforcement Learning
2023 · Chenghao Li, Tonghan Wang, Chongjie Zhang, et al.
Abstract
In the realm of multi-agent reinforcement learning, intrinsic motivations have emerged as a pivotal tool for exploration. While the computation of many intrinsic rewards relies on estimating variational posteriors using neural network approximators, a notable challenge has surfaced due to the limited expressive capability of these neural statistics approximators. We pinpoint this challenge as the "revisitation" issue, where agents recurrently explore confined areas of the task space. To combat this, we propose a dynamic reward scaling approach. This method is crafted to stabilize the significant fluctuations in intrinsic rewards in previously explored areas and promote broader exploration, effectively curbing the revisitation phenomenon. Our experimental findings underscore the efficacy of our approach, showcasing enhanced performance in demanding environments like Google Research Football and StarCraft II micromanagement tasks, especially in sparse reward settings.
Related papers
- Coordinated Exploration Via Intrinsic Rewards For Multi-agent Reinforcement Learning (2019)
- Curiosity-driven Exploration In Sparse-reward Multi-agent Reinforcement Learning (2023)
- Rewarding Episodic Visitation Discrepancy For Exploration In Reinforcement Learning (2022)
- Curiosity-driven Multi-agent Exploration With Mixed Objectives (2022)
- Exploration With Unreliable Intrinsic Reward In Multi-agent Reinforcement Learning (2019)
- The Impact Of Intrinsic Rewards On Exploration In Reinforcement Learning (2025)
- Exploration And Incentives In Reinforcement Learning (2021)
- Long-term Visitation Value For Deep Exploration In Sparse Reward Reinforcement Learning (2020)