Sequential Information Design: Markov Persuasion Process And Its Efficient Reinforcement Learning
2022 · Jibang Wu, Zixuan Zhang, Zhe Feng, et al.
Abstract
In today's economy, it has become important for Internet platforms to consider the sequential information design problem in order to align their long-term interests with the incentives of gig service providers. This paper proposes a novel model of sequential information design, namely Markov persuasion processes (MPPs), where a sender with an informational advantage seeks to persuade a stream of myopic receivers to take actions that maximize the sender's cumulative utility in a finite-horizon Markovian environment with varying prior and utility functions. Planning in MPPs thus faces the unique challenge of finding a signaling policy that is simultaneously persuasive to the myopic receivers and induces the optimal long-term cumulative utility for the sender. Nevertheless, at the population level where the model is known, it turns out that we can efficiently determine the optimal (resp. \(\epsilon\)-optimal) policy with finite (resp. infinite) states and outcomes, through a modified formulation
Related papers
- Markov Persuasion Processes: Learning To Persuade From Scratch (2024)
- Off-policy Evaluation For Sequential Persuasion Process With Unobserved Confounding (2025)
- Optimizing The Long-term Average Reward For Continuing MDPs: A Technical Report (2021)
- Stackelberg POMDP: A Reinforcement Learning Approach For Economic Design (2022)
- Provably Efficient UCB-type Algorithms For Learning Predictive State Representations (2023)
- Learning To Steer Markovian Agents Under Model Uncertainty (2024)
- Information-theoretic Methods For Planning And Learning In Partially Observable Markov Decision Processes (2016)
- Online Reinforcement Learning In Markov Decision Process Using Linear Programming (2023)