Decentralized Graph-based Multi-agent Reinforcement Learning Using Reward Machines
2021 · Jueming Hu, Zhe Xu, Weichang Wang, et al.
Abstract
In multi-agent reinforcement learning (MARL), it is challenging for a collection of agents to learn complex temporally extended tasks. The difficulties lie in computational complexity and how to learn the high-level ideas behind reward functions. We study the graph-based Markov Decision Process (MDP) where the dynamics of neighboring agents are coupled. We use a reward machine (RM) to encode each agent's task and expose reward function internal structures. RM has the capacity to describe high-level knowledge and encode non-Markovian reward functions. We propose a decentralized learning algorithm to tackle computational complexity, called decentralized graph-based reinforcement learning using reward machines (DGRM), that equips each agent with a localized policy, allowing agents to make decisions independently, based on the information available to the agents. DGRM uses the actor-critic structure, and we introduce the tabular Q-function for discrete state problems. We show that the depe
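As an illustration of how a reward machine can encode a non-Markovian reward, here is a minimal sketch of a single agent's RM as a finite-state machine. The class name `RewardMachine`, the state names, the event labels `"a"` and `"b"`, and the example task are assumptions made for this illustration; they are not taken from the paper and this is not the DGRM algorithm itself.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Finite-state machine that emits a reward on each labeled transition."""
    initial_state: str
    # Maps (rm_state, event_label) -> (next_rm_state, reward).
    transitions: dict = field(default_factory=dict)
    state: str = ""

    def __post_init__(self):
        self.state = self.initial_state

    def reset(self):
        self.state = self.initial_state

    def step(self, label):
        """Advance on an event label; unlisted events self-loop with reward 0."""
        next_state, reward = self.transitions.get(
            (self.state, label), (self.state, 0.0)
        )
        self.state = next_state
        return reward


# Hypothetical task: observe event "a" first, then event "b", to earn reward 1.
rm = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "a"): ("u1", 0.0),
        ("u1", "b"): ("u_done", 1.0),
    },
)


def run(labels):
    """Reset the machine and return the reward emitted for each label."""
    rm.reset()
    return [rm.step(label) for label in labels]


rewards_in_order = run(["a", "b"])
rewards_out_of_order = run(["b", "a"])
```

The reward for event `"b"` depends on whether `"a"` was seen earlier, so it is not a function of the current environment state alone. Tracking the RM state alongside the environment state makes that history explicit, which is the sense in which an RM exposes the structure of the reward function.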
Related papers
- Fully Decentralized Multi-agent Reinforcement Learning With Networked Agents (2018)
- Reinforcement Learning With Reward Machines In Stochastic Games (2023)
- Mean-field Multi-agent Reinforcement Learning: A Decentralized Network Approach (2021)
- Decentralized Multi-agent Reinforcement Learning With Networked Agents: Recent Advances (2019)
- Learning Reward Machines: A Study In Partially Observable Reinforcement Learning (2021)
- Inferring Probabilistic Reward Machines From Non-Markovian Reward Processes For Reinforcement Learning (2021)
- Multi-agent Reinforcement Learning In Stochastic Networked Systems (2020)
- Multi-agent Reinforcement Learning Via Adaptive Kalman Temporal Difference And Successor Representation (2021)