Instance-optimality In Interactive Decision Making: Toward A Non-asymptotic Theory
2023 Β· Andrew Wagenmaker, Dylan J. Foster
Abstract
We consider the development of adaptive, instance-dependent algorithms for interactive decision making (bandits, reinforcement learning, and beyond) that, rather than only performing well in the worst case, adapt to favorable properties of real-world instances for improved performance. We aim for instance-optimality, a strong notion of adaptivity which asserts that, on any particular problem instance, the algorithm under consideration outperforms all consistent algorithms. Instance-optimality enjoys a rich asymptotic theory originating from the work of \citet\{lai1985asymptotically,graves1997asymptotically\}, but non-asymptotic guarantees have remained elusive outside of certain special cases. Even for problems as simple as tabular reinforcement learning, existing algorithms do not attain instance-optimal performance until the number of rounds of interaction is doubly exponential in the number of states. In this paper, we take the first step toward developing a non-asymptotic theory
Authors
(none)
Tags
Stats
Related papers
- Instance-dependent Confidence And Early Stopping For Reinforcement Learning (2022)0.00
- The Statistical Complexity Of Interactive Decision Making (2021)0.00
- Dynamic Memory For Interpretable Sequential Optimisation (2022)0.00
- Beyond No Regret: Instance-dependent PAC Reinforcement Learning (2021)0.00
- Instance-dependent Near-optimal Policy Identification In Linear Mdps Via Online Experiment Design (2022)0.00
- Accelerated And Instance-optimal Policy Evaluation With Linear Function Approximation (2021)0.00
- Optimality Inductive Biases And Agnostic Guidelines For Offline Reinforcement Learning (2021)0.00
- Online Learning With Costly Features In Non-stationary Environments (2023)0.00