Large-scale Traffic Signal Control Using A Novel Multi-agent Reinforcement Learning
2019 · Xiaoqiang Wang, Liangjun Ke, Zhimin Qiao, et al.
Abstract
Finding the optimal signal timing strategy is difficult in large-scale traffic signal control (TSC). Multi-Agent Reinforcement Learning (MARL) is a promising approach to this problem. However, existing methods leave room for improvement in scaling to large problems and in modeling, for each agent, the behavior of the other agents. This paper proposes a new MARL method, Cooperative double Q-learning (Co-DQL), with several prominent features. It uses a highly scalable independent double Q-learning method based on double estimators and the UCB policy, which can eliminate the over-estimation problem of traditional independent Q-learning while ensuring exploration. It uses mean field approximation to model the interaction among agents, which helps agents learn a better cooperative strategy. To improve the stability and robustness of the learning process, we introduce a new reward allocation mechanism and a local state sharing m
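The abstract names double estimators and a UCB policy but does not give the update rules. The following is a minimal sketch of generic tabular double Q-learning with a UCB1-style exploration bonus, not the authors' Co-DQL. The class name `DoubleQAgent`, its parameters, and the tabular state/action encoding are all assumptions made for illustration. The mean field approximation, the reward allocation mechanism, and state sharing are deliberately left out.

```python
import math
import random


class DoubleQAgent:
    """Tabular double Q-learning with UCB action selection (illustrative sketch)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95, c=1.0, seed=0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.c = alpha, gamma, c
        # Two independent estimators; hypothetical tabular representation.
        self.qa = [[0.0] * n_actions for _ in range(n_states)]
        self.qb = [[0.0] * n_actions for _ in range(n_states)]
        # Visit counts per (state, action), used by the UCB bonus.
        self.counts = [[0] * n_actions for _ in range(n_states)]
        self.rng = random.Random(seed)

    def select_action(self, s):
        # Try every action once before applying the UCB formula.
        for a in range(self.n_actions):
            if self.counts[s][a] == 0:
                return a
        total = sum(self.counts[s])

        def score(a):
            mean_q = 0.5 * (self.qa[s][a] + self.qb[s][a])
            bonus = self.c * math.sqrt(math.log(total) / self.counts[s][a])
            return mean_q + bonus

        return max(range(self.n_actions), key=score)

    def update(self, s, a, r, s_next):
        self.counts[s][a] += 1
        # Randomly pick which estimator learns; the other evaluates the target.
        if self.rng.random() < 0.5:
            learn, evaluate = self.qa, self.qb
        else:
            learn, evaluate = self.qb, self.qa
        # The learning table selects the greedy next action, the other scores it.
        best = max(range(self.n_actions), key=lambda x: learn[s_next][x])
        target = r + self.gamma * evaluate[s_next][best]
        learn[s][a] += self.alpha * (target - learn[s][a])
```

Decoupling action selection from action evaluation is what reduces the over-estimation bias of a single max over noisy estimates, while the count-based bonus keeps rarely tried actions in play.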
Related papers
- Mean-field Multi-agent Reinforcement Learning: A Decentralized Network Approach (2021)
- An Initial Introduction To Cooperative Multi-agent Reinforcement Learning (2024)
- MARL-LNS: Cooperative Multi-agent Reinforcement Learning Via Large Neighborhoods Search (2024)
- A Review Of Cooperative Multi-agent Deep Reinforcement Learning (2019)
- AC2C: Adaptively Controlled Two-hop Communication For Multi-agent Reinforcement Learning (2023)
- Locality Matters: A Scalable Value Decomposition Approach For Cooperative Multi-agent Reinforcement Learning (2021)
- Multi-agent Reinforcement Learning In Stochastic Networked Systems (2020)
- Strategic Coordination For Evolving Multi-agent Systems: A Hierarchical Reinforcement And Collective Learning Approach (2025)