Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-making For Continuous Monitoring
2023 · Runzhe Wan, Yu Liu, James McQueen, et al.
Abstract
With the growing need for online A/B testing to support innovation in industry, the opportunity cost of running an experiment becomes non-negligible. There is therefore an increasing demand for an efficient continuous monitoring service that allows early stopping when appropriate. Classic statistical methods focus on hypothesis testing and were mostly developed for traditional high-stakes problems such as clinical trials, whereas experiments at online service companies typically have very different features and priorities. Motivated by these practical needs, this paper introduces a novel framework that we developed at Amazon to maximize customer experience and control opportunity cost. We formulate the problem as a Bayesian optimal sequential decision-making problem with a unified utility function. We discuss practical design choices and considerations extensively. We further show how to solve for the optimal decision rule via Reinforcement Learning and how to scale the solution. We show th
Related papers
- Sequential Bayesian Experimental Designs Via Reinforcement Learning (2022)
- Dynamic Memory For Interpretable Sequential Optimisation (2022)
- Statistically Efficient Bayesian Sequential Experiment Design Via Reinforcement Learning With Cross-entropy Estimators (2023)
- Performance Comparisons Of Reinforcement Learning Algorithms For Sequential Experimental Design (2025)
- Design Experiments To Compare Multi-armed Bandit Algorithms (2026)
- An Experimental Design Perspective On Model-based Reinforcement Learning (2021)
- Accelerating Offline Reinforcement Learning Application In Real-time Bidding And Recommendation: Potential Use Of Simulation (2021)
- Online Matching Via Reinforcement Learning: An Expert Policy Orchestration Strategy (2025)