DREAM: Deep Regret Minimization With Advantage Baselines And Model-free Learning
2020 · Eric Steinberger, Adam Lerer, Noam Brown
Abstract
We introduce DREAM, a deep reinforcement learning algorithm that finds optimal strategies in imperfect-information games with multiple agents. Formally, DREAM converges to a Nash Equilibrium in two-player zero-sum games and to an extensive-form coarse correlated equilibrium in all other games. Our primary innovation is an effective algorithm that, in contrast to other regret-based deep learning algorithms, does not require access to a perfect simulator of the game to achieve good performance. We show that DREAM empirically achieves state-of-the-art performance among model-free algorithms in popular benchmark games, and is even competitive with algorithms that do use a perfect simulator.
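As background for the "regret-based" family the abstract refers to, the following is a minimal sketch of tabular regret matching in self-play on a two-player zero-sum matrix game. It is not DREAM itself: there are no neural networks, no sampling, and no advantage baselines, and the code reads the full payoff matrix, which is exactly the kind of simulator access DREAM is designed to avoid. The function names `regret_matching` and `solve_matrix_game`, the random initial regrets, and the iteration count are assumptions made for illustration only. The sketch shows the property the abstract states for the two-player zero-sum case: the average strategies approach a Nash equilibrium.

```python
import numpy as np


def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No positive regret anywhere: fall back to the uniform strategy.
    return np.full(len(regrets), 1.0 / len(regrets))


def solve_matrix_game(payoff, iterations=20000, seed=0):
    """Self-play regret matching on a zero-sum matrix game.

    payoff[i, j] is the row player's payoff; the column player receives
    the negative. Returns the average strategies of both players.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = payoff.shape
    # Small random initial regrets (an illustrative choice) so that the
    # dynamics do not start exactly at the uniform strategy.
    regrets_row = rng.random(n_rows)
    regrets_col = rng.random(n_cols)
    sum_row = np.zeros(n_rows)
    sum_col = np.zeros(n_cols)

    for _ in range(iterations):
        s_row = regret_matching(regrets_row)
        s_col = regret_matching(regrets_col)
        sum_row += s_row
        sum_col += s_col

        # Expected value of each pure action against the opponent's
        # current mixed strategy.
        u_row = payoff @ s_col
        u_col = -(s_row @ payoff)

        # Regret of an action: its value minus the value of the
        # strategy that was actually played.
        regrets_row += u_row - s_row @ u_row
        regrets_col += u_col - s_col @ u_col

    return sum_row / iterations, sum_col / iterations
```

A possible usage on rock-paper-scissors, whose unique Nash equilibrium is the uniform strategy:

```python
rps = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], dtype=float)
row_avg, col_avg = solve_matrix_game(rps)
print(row_avg, col_avg)  # both should be close to uniform
```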
Related papers
- Regret Minimization And Convergence To Equilibria In General-sum Markov Games (2022)
- Model-free Online Learning In Unknown Sequential Decision Making Problems And Games (2021)
- Decentralized Model-free Reinforcement Learning In Stochastic Games With Average-reward Objective (2023)
- A Unified Perspective On Deep Equilibrium Finding (2022)
- Dreamsmooth: Improving Model-based Reinforcement Learning Via Reward Smoothing (2023)
- Evolutionary Dynamics And \(\phi\)-regret Minimization In Games (2021)
- Variance Reduction In Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) For Extensive Form Games Using Baselines (2018)
- Generalized Bandit Regret Minimizer Framework In Imperfect Information Extensive-form Game (2022)