Computably Continuous Reinforcement-Learning Objectives Are PAC-Learnable
2023 · Cambridge Yang, Michael Littman, Michael Carbin
Abstract
In reinforcement learning, the classic objectives of maximizing discounted and finite-horizon cumulative rewards are PAC-learnable: There are algorithms that learn a near-optimal policy with high probability using a finite amount of samples and computation. In recent years, researchers have introduced objectives and corresponding reinforcement-learning algorithms beyond the classic cumulative rewards, such as objectives specified as linear temporal logic formulas. However, questions about the PAC-learnability of these new objectives have remained open. This work demonstrates the PAC-learnability of general reinforcement-learning objectives through sufficient conditions for PAC-learnability in two analysis settings. In particular, for the analysis that considers only sample complexity, we prove that if an objective given as an oracle is uniformly continuous, then it is PAC-learnable. Further, for the analysis that considers computational complexity, we prove that if an objective is co
Related papers
- On the (In)Tractability of Reinforcement Learning for LTL Objectives (2021)
- A PAC Learning Algorithm for LTL and Omega-Regular Objectives in MDPs (2023)
- Reinforcement Learning with Non-Cumulative Objective (2023)
- On Oracle-Efficient PAC RL with Rich Observations (2018)
- Unified Algorithms for RL with Decision-Estimation Coefficients: PAC, Reward-Free, Preference-Based Learning, and Beyond (2022)
- Beyond No Regret: Instance-Dependent PAC Reinforcement Learning (2021)
- Provably Efficient UCB-Type Algorithms for Learning Predictive State Representations (2023)
- Efficient PAC Reinforcement Learning in Regular Decision Processes (2021)