Actively Learning Reinforcement Learning: A Stochastic Optimal Control Approach
2023 · Mohammad S. Ramadan, Mahmoud A. Hayajnh, Michael T. Tolley, et al.
Abstract
In this paper, we propose a framework for achieving two intertwined objectives: (i) equipping reinforcement learning with active exploration and deliberate information gathering, so that it regulates the state and parameter uncertainties that result from modeling mismatches and noisy sensing; and (ii) overcoming the computational intractability of stochastic optimal control. We approach both objectives by using reinforcement learning to compute the stochastic optimal control law. On the one hand, we avoid the curse of dimensionality that prohibits the direct solution of the stochastic dynamic programming equation. On the other hand, the resulting stochastic optimal control reinforcement learning agent exhibits caution and probing, that is, optimal online exploration and exploitation. Unlike a fixed exploration-exploitation balance, caution and probing are employed automatically by the controller in real time, even after the learning process has terminated. We conclude the paper with a numerical
Related papers
- Exploration Versus Exploitation In Reinforcement Learning: A Stochastic Control Approach (2018) · 9.76
- Optimal Exploration For Model-based RL In Nonlinear Systems (2023) · 0.00
- An Optimal Policy For Learning Controllable Dynamics By Exploration (2025) · 0.00
- Stochastic Reinforcement Learning (2019) · 5.24
- From Reinforcement Learning To Optimal Control: A Unified Framework For Sequential Decisions (2019) · 0.00
- Learning Optimal Deterministic Policies With Stochastic Policy Gradients (2024) · 0.00
- A General Markov Decision Process Framework For Directly Learning Optimal Control Policies (2019) · 0.00
- Guided Exploration In Reinforcement Learning Via Monte Carlo Critic Optimization (2022) · 0.00