Policy Evaluation in Decentralized POMDPs with Belief Sharing
2023 · Mert Kayaalp, Fatima Ghadieh, Ali H. Sayed
Abstract
Most works on multi-agent reinforcement learning focus on scenarios where the state of the environment is fully observable. In this work, we consider a cooperative policy evaluation task in which agents are not assumed to observe the environment state directly. Instead, agents can only have access to noisy observations and to belief vectors. It is well-known that finding global posterior distributions under multi-agent settings is generally NP-hard. As a remedy, we propose a fully decentralized belief forming strategy that relies on individual updates and on localized interactions over a communication network. In addition to the exchange of the beliefs, agents exploit the communication network by exchanging value function parameter estimates as well. We analytically show that the proposed strategy allows information to diffuse over the network, which in turn allows the agents' parameters to have a bounded difference with a centralized baseline. A multi-sensor target tracking applicatio
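The belief-forming idea summarized in the abstract (individual updates followed by localized interactions over a network) can be illustrated with a minimal sketch. The abstract does not state the exact update or combination rule, so everything below is an assumption made for illustration: each agent applies a Bayes step with its own observation likelihood, then pools its neighbors' intermediate beliefs by geometric (log-linear) averaging. The names `local_update`, `combine`, `network_step`, and the weight matrix `A` are hypothetical, beliefs are assumed strictly positive, and the state-transition (prediction) step of a POMDP is omitted for brevity.

```python
import math

def local_update(belief, likelihood):
    """Bayes step: weight each state's prior belief by the observation likelihood."""
    unnorm = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def combine(neighbor_beliefs, weights):
    """Geometric (log-linear) average of neighbors' beliefs, renormalized.
    Assumes every belief entry is strictly positive."""
    n_states = len(neighbor_beliefs[0])
    log_mix = [
        sum(w * math.log(b[s]) for b, w in zip(neighbor_beliefs, weights))
        for s in range(n_states)
    ]
    peak = max(log_mix)  # subtract the max for numerical stability
    unnorm = [math.exp(v - peak) for v in log_mix]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def network_step(beliefs, likelihoods, A):
    """One round: every agent updates locally, then pools with its neighbors.
    A[k][j] is the (hypothetical) weight agent k gives agent j; each row sums
    to 1, and a zero entry means j is not a neighbor of k."""
    intermediate = [local_update(b, l) for b, l in zip(beliefs, likelihoods)]
    pooled = []
    for k in range(len(beliefs)):
        nbrs = [j for j in range(len(beliefs)) if A[k][j] > 0]
        pooled.append(
            combine([intermediate[j] for j in nbrs], [A[k][j] for j in nbrs])
        )
    return pooled

# Two agents, two states: agent 0 gets an informative observation, agent 1 does not.
beliefs = [[0.5, 0.5], [0.5, 0.5]]
likelihoods = [[0.9, 0.1], [0.5, 0.5]]
A = [[0.5, 0.5], [0.5, 0.5]]
new_beliefs = network_step(beliefs, likelihoods, A)
print(new_beliefs)
```

In this toy run the uninformed agent ends up sharing the informed agent's evidence after a single exchange, which is the kind of information diffusion the abstract refers to. The same neighbor-averaging pattern could in principle be applied to value-function parameter estimates, but that step is not shown here.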
Related papers
- Belief States for Cooperative Multi-agent Reinforcement Learning under Partial Observability (2025)
- Centralized Model and Exploration Policy for Multi-agent RL (2021)
- Cooperative Multi-agent Policy Gradients with Sub-optimal Demonstration (2018)
- Agent-state Based Policies in POMDPs: Beyond Belief-state MDPs (2024)
- Off-belief Learning (2021)
- Multi-agent Fully Decentralized Value Function Learning with Linear Convergence Rates (2018)
- More Centralized Training, Still Decentralized Execution: Multi-agent Conditional Policy Factorization (2022)
- Deep Decentralized Multi-task Multi-agent Reinforcement Learning under Partial Observability (2017)