The Role Of Lookahead And Approximate Policy Evaluation In Reinforcement Learning With Linear Value Function Approximation
2021 · Anna Winnicki, Joseph Lubars, Michael Livesay, et al.
Abstract
Function approximation is widely used in reinforcement learning to handle the computational difficulties associated with very large state spaces. However, function approximation introduces errors which may lead to instabilities when using approximate dynamic programming techniques to obtain the optimal policy. Therefore, techniques such as lookahead for policy improvement and m-step rollout for policy evaluation are used in practice to improve the performance of approximate dynamic programming with function approximation. We quantitatively characterize, for the first time, the impact of lookahead and m-step rollout on the performance of approximate dynamic programming (DP) with function approximation: (i) without a sufficient combination of lookahead and m-step rollout, approximate DP may not converge, (ii) both lookahead and m-step rollout improve the convergence rate of approximate DP, and (iii) lookahead helps mitigate the effect of function approximation and the discount factor on
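The two ingredients named in the abstract, lookahead for policy improvement and m-step rollout for policy evaluation, can be illustrated with a minimal sketch. It assumes a small MDP with known transition tensor `P` (shape actions x states x states) and reward matrix `r` (shape states x actions), a feature matrix `Phi`, and a least-squares fit on a chosen subset of states. The function names, array layout, and fitting step are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def bellman_q(P, r, gamma, J):
    """Q(s, a) = r(s, a) + gamma * sum_n P[a, s, n] * J[n]."""
    return r + gamma * np.einsum("asn,n->sa", P, J)

def bellman_opt(P, r, gamma, J):
    """One application of the Bellman optimality operator T."""
    return bellman_q(P, r, gamma, J).max(axis=1)

def approx_dp_step(P, r, gamma, Phi, theta, H, m, sample_states):
    """One iteration: H-step lookahead, m-step rollout, linear fit.

    Hypothetical helper: H >= 1 is the lookahead depth, m >= 1 the
    rollout length, sample_states the states used for the fit.
    """
    S = r.shape[0]
    V = Phi @ theta                      # current value estimate
    for _ in range(H - 1):               # V <- T^(H-1) V
        V = bellman_opt(P, r, gamma, V)
    # Lookahead policy: greedy with respect to T^(H-1) applied to the estimate.
    mu = bellman_q(P, r, gamma, V).argmax(axis=1)
    P_mu = P[mu, np.arange(S)]           # row s is P[mu[s], s, :]
    r_mu = r[np.arange(S), mu]
    for _ in range(m):                   # m-step rollout: V <- T_mu^m V
        V = r_mu + gamma * P_mu @ V
    # Least-squares fit of the rollout values on the sampled states.
    theta_new, *_ = np.linalg.lstsq(
        Phi[sample_states], V[sample_states], rcond=None
    )
    return theta_new, mu
```

With identity features and every state sampled, the fit is exact and the step reduces to applying the rollout operator to the lookahead values, which makes it a convenient sanity check before trying coarser features.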
Related papers
- Reinforcement Learning With Unbiased Policy Evaluation And Linear Function Approximation (2022) · 0.00
- Provably Efficient Reinforcement Learning With Linear Function Approximation (2019) · 11.76
- Linear Function Approximation As A Computationally Efficient Method To Solve Classical Reinforcement Learning Challenges (2024) · 0.00
- Adaptive Approximate Policy Iteration (2020) · 0.00
- Accelerated And Instance-optimal Policy Evaluation With Linear Function Approximation (2021) · 0.00
- Minimax-optimal Off-policy Evaluation With Linear Function Approximation (2020) · 0.00
- The Optimal Approximation Factors In Misspecified Off-policy Value Function Estimation (2023) · 0.00
- A Unifying View Of Linear Function Approximation In Off-policy RL Through Matrix Splitting And Preconditioning (2025) · 0.00