Logarithmic Regret For Nonlinear Control
2025 · James Wang, Bruce D. Lee, Ingvar Ziemann, et al.
Abstract
We address the problem of learning to control an unknown nonlinear dynamical system through sequential interactions. Motivated by high-stakes applications in which mistakes can be catastrophic, such as robotics and healthcare, we study situations where it is possible for fast sequential learning to occur. Fast sequential learning is characterized by the ability of the learning agent to incur logarithmic regret relative to a fully-informed baseline. We demonstrate that fast sequential learning is achievable in a diverse class of continuous control problems where the system dynamics depend smoothly on unknown parameters, provided the optimal control policy is persistently exciting. Additionally, we derive a regret bound which grows with the square root of the number of interactions for cases where the optimal policy is not persistently exciting. Our results provide the first regret bounds for controlling nonlinear dynamical systems depending nonlinearly on unknown parameters. We validate
Related papers
- Logarithmic Regret For Episodic Continuous-time Linear-quadratic Reinforcement Learning Over A Finite-time Horizon (2020)
- Implications Of Regret On Stability Of Linear Dynamical Systems (2022)
- Sublinear Regret For A Class Of Continuous-time Linear-quadratic Reinforcement Learning Problems (2024)
- Scalable Regret For Learning To Control Network-coupled Subsystems With Unknown Dynamics (2021)
- Foundations Of Safe Online Reinforcement Learning In The Linear Quadratic Regulator: \(\sqrt{t}\)-regret (2025)
- Regret Bounds For Episodic Risk-sensitive Linear Quadratic Regulator (2024)
- Online Policy Gradient For Model Free Learning Of Linear Quadratic Regulators With \(\sqrt{t}\) Regret (2021)
- First-order Regret In Reinforcement Learning With Linear Function Approximation: A Robust Estimation Approach (2021)