On The Limited Representational Power Of Value Functions And Its Links To Statistical (in)efficiency
2024 · David Cheikhi, Daniel Russo
Abstract
Identifying the trade-offs between model-based and model-free methods is a central question in reinforcement learning. Value-based methods offer substantial computational advantages and are sometimes just as statistically efficient as model-based methods. However, focusing on the core problem of policy evaluation, we show that information about the transition dynamics may be impossible to represent in the space of value functions. We explore this through a series of case studies focused on structures that arise in many important problems. In several, there is no information loss and value-based methods are as statistically efficient as model-based ones. In other closely related examples, information loss is severe and value-based methods are severely outperformed. A deeper investigation points to the limited representational power of value functions as the driver of the inefficiency, rather than a failure in algorithm design.
Related papers
- The Value Equivalence Principle For Model-based Reinforcement Learning (2020) 0.00
- Deciding What To Model: Value-equivalent Sampling For Reinforcement Learning (2022) 0.00
- On The Model-based Stochastic Value Gradient For Continuous Reinforcement Learning (2020) 0.00
- Convex Programs And Lyapunov Functions For Reinforcement Learning: A Unified Perspective On The Analysis Of Value-based Methods (2022) 2.26
- The Value-improvement Path: Towards Better Representations For Reinforcement Learning (2020) 6.77
- High-confidence Error Estimates For Learned Value Functions (2018) 0.00
- Between Rate-distortion Theory & Value Equivalence In Model-based Reinforcement Learning (2022) 0.00
- Is There Value In Reinforcement Learning? (2025) 0.00