Maximizing Utility In Multi-agent Environments By Anticipating The Behavior Of Other Learners
2024 · Angelos Assos, Yuval Dagan, Constantinos Daskalakis
Abstract
Learning algorithms are often used to make decisions in sequential decision-making environments. In multi-agent settings, the decisions of each agent can affect the utilities/losses of the other agents. Therefore, if an agent is good at anticipating the behavior of the other agents, in particular how they will make decisions in each round as a function of their experience so far, it could try to judiciously make its own decisions over the rounds of the interaction so as to influence the other agents to behave in a way that ultimately benefits its own utility. In this paper, we study repeated two-player games involving two types of agents: a learner, which employs an online learning algorithm to choose its strategy in each round; and an optimizer, which knows the learner's utility function and the learner's online learning algorithm. The optimizer wants to plan ahead to maximize its own utility, while taking into account the learner's behavior. We provide two results: a positive resul