Rethinking Model-based, Policy-based, And Value-based Reinforcement Learning Via The Lens Of Representation Complexity
2023 · Guhao Feng, Han Zhong
Abstract
Reinforcement Learning (RL) encompasses diverse paradigms, including model-based RL, policy-based RL, and value-based RL, each tailored to approximate the model, optimal policy, and optimal value function, respectively. This work investigates the potential hierarchy of representation complexity -- the complexity of functions to be represented -- among these RL paradigms. We first demonstrate that, for a broad class of Markov decision processes (MDPs), the model can be represented by constant-depth circuits with polynomial size or Multi-Layer Perceptrons (MLPs) with constant layers and polynomial hidden dimension. However, the representation of the optimal policy and optimal value proves to be \(\mathsf{NP}\)-complete and unattainable by constant-layer MLPs with polynomial size. This demonstrates a significant representation complexity gap between model-based RL and model-free RL, which includes policy-based RL and value-based RL. To further explore the representation complexity hiera
Related papers
- The Value-improvement Path: Towards Better Representations For Reinforcement Learning (2020)
- Simplifying Model-based RL: Learning Representations, Latent-space Models, And Policies With One Objective (2022)
- The Value Equivalence Principle For Model-based Reinforcement Learning (2020)
- Spectral Representation-based Reinforcement Learning (2025)
- PC-MLP: Model-based Reinforcement Learning With Policy Cover Guided Exploration (2021)
- Sample-efficient Reinforcement Learning Is Feasible For Linearly Realizable MDPs With Limited Revisiting (2021)
- No Representation, No Trust: Connecting Representation, Collapse, And Trust Issues In PPO (2024)
- Learning Symbolic Representations For Reinforcement Learning Of Non-Markovian Behavior (2023)