Finite-time Analysis Of Minimax Q-learning For Two-player Zero-sum Markov Games: Switching System Approach
2023 · Donghwan Lee
Abstract
This paper presents a finite-time analysis of a Q-learning algorithm for two-player zero-sum Markov games. Specifically, we establish a finite-time analysis of both the minimax Q-learning algorithm and the corresponding value iteration method. To carry out the analysis, we employ switching system models of minimax Q-learning and of the associated value iteration. This approach provides further insight into minimax Q-learning and yields a simpler and more transparent convergence analysis. We anticipate that these insights can uncover new connections between control theory and reinforcement learning and foster collaboration between the two communities.
Related papers
- A Generalized Minimax Q-learning Algorithm For Two-player Zero-sum Stochastic Games (2019)
- Finite-time Error Analysis Of Soft Q-learning: Switching System Approach (2024)
- A Discrete-time Switching System Analysis Of Q-learning (2021)
- Finite-time Analysis Of Asynchronous Q-learning Under Diminishing Step-size From Control-theoretic View (2022)
- Decentralized Q-learning In Zero-sum Markov Games (2021)
- FM3Q: Factorized Multi-agent Minimax Q-learning For Two-team Zero-sum Markov Game (2024)
- On The Heterogeneity Of Independent Learning Dynamics In Zero-sum Stochastic Games (2021)
- Two-timescale Q-learning With Function Approximation In Zero-sum Stochastic Games (2023)