A Deeper Understanding Of State-based Critics In Multi-agent Reinforcement Learning
2022 · Xueguang Lyu, Andrea Baisero, Yuchen Xiao, et al.
Abstract
Centralized Training for Decentralized Execution, where training is done in a centralized offline fashion, has become a popular solution paradigm in Multi-Agent Reinforcement Learning. Many such methods take the form of actor-critic with state-based critics, since centralized training allows access to the true system state, which can be useful during training despite not being available at execution time. State-based critics have become a common empirical choice, albeit one which has had limited theoretical justification or analysis. In this paper, we show that state-based critics can introduce bias in the policy gradient estimates, potentially undermining the asymptotic guarantees of the algorithm. We also show that, even if the state-based critics do not introduce any bias, they can still result in a larger gradient variance, contrary to the common intuition. Finally, we show the effects of the theories in practice by comparing different forms of centralized critics on a wide range o
Related papers
- On Centralized Critics In Multi-agent Reinforcement Learning (2024)
- Contrasting Centralized And Decentralized Critics In Multi-agent Reinforcement Learning (2021)
- Actor-attention-critic For Multi-agent Reinforcement Learning (2018)
- Reducing Overestimation Bias In Multi-agent Domains Using Double Centralized Critics (2019)
- Local Advantage Actor-critic For Robust Multi-agent Deep Reinforcement Learning (2021)
- Communication-efficient Actor-critic Methods For Homogeneous Markov Games (2022)
- Learning Implicit Credit Assignment For Cooperative Multi-agent Reinforcement Learning (2020)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic For Cooperative Multi-agent Reinforcement Learning (2020)