Multi-agent Reinforcement Learning In Cournot Games
2020 · Yuanyuan Shi, Baosen Zhang
Abstract
In this work, we study the interaction of strategic agents in continuous-action Cournot games with limited information feedback. The Cournot game is an essential market model for many socio-economic systems in which agents learn and compete without full knowledge of the system or of each other. We consider the dynamics of the policy gradient algorithm, a widely adopted reinforcement learning algorithm for continuous control, in concave Cournot games. We prove that policy gradient dynamics converge to the Nash equilibrium when the price function is linear or the number of agents is two. To the best of our knowledge, this is the first convergence result for learning algorithms with continuous action spaces that do not fall in the no-regret class.
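As a rough illustration of the linear-price case mentioned in the abstract, the sketch below simulates simultaneous gradient ascent in a symmetric Cournot game. It is not the authors' algorithm: it assumes a price function P(Q) = a - b*Q, a common constant marginal cost c, and a deterministic central-difference estimate of each agent's gradient computed from that agent's own profit only. All function names, parameter values, and the step size are hypothetical choices made for the example.

```python
def nash_quantity(a, b, c, n):
    """Closed-form symmetric Nash output for linear price a - b*Q and marginal cost c."""
    return (a - c) / (b * (n + 1))


def profit(i, q, a, b, c):
    """Profit of agent i: (market price - marginal cost) * own quantity."""
    price = a - b * sum(q)
    return (price - c) * q[i]


def estimate_gradient(i, q, a, b, c, delta=1e-3):
    """Central-difference estimate of d(profit_i)/d(q_i) using only agent i's own payoff."""
    up = list(q)
    up[i] += delta
    down = list(q)
    down[i] -= delta
    return (profit(i, up, a, b, c) - profit(i, down, a, b, c)) / (2 * delta)


def run_gradient_dynamics(q0, a, b, c, step=0.1, iters=500):
    """Simultaneous gradient ascent by all agents, projected onto q_i >= 0."""
    q = list(q0)
    for _ in range(iters):
        grads = [estimate_gradient(i, q, a, b, c) for i in range(len(q))]
        q = [max(0.0, qi + step * g) for qi, g in zip(q, grads)]
    return q


# Illustrative parameters (assumed, not taken from the paper).
a, b, c = 10.0, 1.0, 1.0
final = run_gradient_dynamics([0.5, 3.0, 5.0], a, b, c)
target = nash_quantity(a, b, c, n=3)
print(final, target)
```

With these assumed values, the step size is small enough that the update is a contraction, so the quantities settle near the closed-form equilibrium. Larger step sizes or nonlinear price functions are not covered by this sketch.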
Related papers
- Policy-gradient Algorithms Have No Guarantees Of Convergence In Linear Quadratic Games (2019)
- Convergence Analysis Of Gradient-based Learning With Non-uniform Learning Rates In Non-cooperative Multi-agent Settings (2019)
- Global Convergence Of Policy Gradient For Linear-quadratic Mean-field Control/game In Continuous Time (2020)
- The Bounds Of Algorithmic Collusion; \(q\)-learning, Gradient Learning, And The Folk Theorem (2024)
- On Information Asymmetry In Competitive Multi-agent Reinforcement Learning: Convergence And Optimality (2020)
- Exploration-exploitation In Multi-agent Competition: Convergence With Bounded Rationality (2021)
- On The Convergence Of Policy Gradient Methods To Nash Equilibria In General Stochastic Games (2022)
- Beyond Strict Competition: Approximate Convergence Of Multi Agent Q-learning Dynamics (2023)