Learning To Model Opponent Learning
2020 · Ian Davies, Zheng Tian, Jun Wang
Abstract
Multi-Agent Reinforcement Learning (MARL) considers settings in which a set of coexisting agents interact with one another and their environment. The adaptation and learning of other agents induce non-stationarity in the environment dynamics. This poses a great challenge for value-function-based algorithms, whose convergence usually relies on the assumption of a stationary environment. Policy search algorithms also struggle in multi-agent settings, because the partial observability that results from not knowing an opponent's actions introduces high variance into policy training. Modelling an agent's opponent(s) is often pursued as a means of resolving the issues arising from the coexistence of learning opponents. An opponent model provides an agent with some ability to reason about other agents to aid its own decision making. Most prior work learns an opponent model by assuming the opponent is employing a stationary policy or switching between a set of stationary policies. Such an approach ca