Managing Temporal Resolution In Continuous Value Estimation: A Fundamental Trade-off
2022 · Zichen Zhang, Johannes Kirschner, Junxi Zhang, et al.
Abstract
A default assumption in reinforcement learning (RL) and optimal control is that observations arrive at discrete time points on a fixed clock cycle. Yet many applications involve continuous-time systems where the time discretization can, in principle, be managed. The impact of time discretization on RL methods has not been fully characterized in existing theory, but a more detailed analysis of its effect could reveal opportunities for improving data efficiency. We address this gap by analyzing Monte-Carlo policy evaluation for LQR systems and uncover a fundamental trade-off between approximation error and statistical error in value estimation. Importantly, these two errors respond differently to time discretization, leading to an optimal choice of temporal resolution for a given data budget. These findings show that managing the temporal resolution can provably improve policy evaluation efficiency in LQR systems with finite data. Empirically, we demonstrate the trade-off in numerical simulati
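The trade-off described in the abstract can be illustrated with a small numerical sketch. This is not the paper's experimental setup: it assumes a scalar, stable linear system dx = a·x dt + sigma dW with running cost q·x², a finite horizon T, and a left Riemann sum as the value estimator. The function names and all parameter values are hypothetical. With a fixed budget of samples, a smaller step size h uses more samples per trajectory and so leaves fewer trajectories to average over, while a larger h gives more trajectories but a coarser approximation of the cost integral.

```python
import numpy as np

def true_value(x0, a, sigma, q, T):
    """Closed-form E[integral_0^T q * x_t^2 dt] for dx = a*x dt + sigma dW (a != 0)."""
    g = (np.exp(2 * a * T) - 1) / (2 * a)  # integral of exp(2*a*t) over [0, T]
    return q * (x0**2 * g + sigma**2 / (2 * a) * (g - T))

def mc_estimate(x0, a, sigma, q, T, h, n_traj, rng):
    """Monte-Carlo value estimate from samples spaced h apart (left Riemann sum)."""
    n_steps = int(round(T / h))
    decay = np.exp(a * h)
    # Exact transition noise, so the only discretization error is in the cost sum.
    noise_std = sigma * np.sqrt((np.exp(2 * a * h) - 1) / (2 * a))
    x = np.full(n_traj, float(x0))
    total = np.zeros(n_traj)
    for _ in range(n_steps):
        total += h * q * x**2
        x = decay * x + noise_std * rng.standard_normal(n_traj)
    return total.mean()

def mse_for_step(h, budget, x0=1.0, a=-1.0, sigma=1.0, q=1.0, T=1.0,
                 n_rep=200, seed=0):
    """Mean squared error of the estimator when `budget` samples are split
    into trajectories of T/h samples each (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    n_steps = int(round(T / h))
    n_traj = max(budget // n_steps, 1)
    v = true_value(x0, a, sigma, q, T)
    errs = [mc_estimate(x0, a, sigma, q, T, h, n_traj, rng) - v
            for _ in range(n_rep)]
    return float(np.mean(np.square(errs)))

if __name__ == "__main__":
    for h in (0.5, 0.25, 0.1, 0.05, 0.01):
        print(f"h={h:<5} mse={mse_for_step(h, budget=200):.5f}")
```

Sweeping h at a fixed budget, as in the loop at the bottom, is one way to look for the intermediate step size at which the combined error is smallest. Where that minimum falls depends on the assumed system parameters and budget.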
Related papers
- An Idiosyncrasy Of Time-discretization In Reinforcement Learning (2024) · 0.00
- When To Sense And Control? A Time-adaptive Approach For Continuous-time RL (2024) · 0.00
- Prediction And Control In Continual Reinforcement Learning (2023) · 0.00
- ACERAC: Efficient Reinforcement Learning In Fine Time Discretization (2021) · 4.52
- Policy Optimization For Continuous Reinforcement Learning (2023) · 2.26
- Least-squares Temporal Difference Learning For The Linear Quadratic Regulator (2017) · 0.00
- Continuous-time Value Iteration For Multi-agent Reinforcement Learning (2025) · 0.00
- Sublinear Regret For A Class Of Continuous-time Linear-quadratic Reinforcement Learning Problems (2024) · 0.00