Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning
2021 · Tong Mu, Georgios Theocharous, David Arbour, et al.
Abstract
Online reinforcement learning (RL) algorithms are often difficult to deploy in complex human-facing applications because they may learn slowly and perform poorly early on. To address this, we introduce a practical algorithm for incorporating human insight to speed up learning. Our algorithm, Constraint Sampling Reinforcement Learning (CSRL), incorporates prior domain knowledge as constraints on the RL policy. It takes in multiple candidate policy constraints, which keeps it robust to misspecification of individual constraints while leveraging helpful ones to learn quickly. Given a base RL algorithm (e.g., UCRL, DQN, Rainbow), we propose an upper-confidence-with-elimination scheme that leverages the relationship between the constraints and their observed performance to adaptively switch among them. We instantiate our algorithm with DQN-type algorithms and UCRL as base algorithms, and evaluate our algorithm in four environments, including three simulators based on real d
Related papers
- Human-inspired Framework To Accelerate Reinforcement Learning (2023) 0.00
- Provably Efficient Exploration In Inverse Constrained Reinforcement Learning (2024) 0.00
- A Few Expert Queries Suffices For Sample-efficient RL With Resets And Linear Value Approximation (2022) 0.00
- Conservative Optimistic Policy Optimization Via Multiple Importance Sampling (2021) 0.00
- Robust Offline Reinforcement Learning With Gradient Penalty And Constraint Relaxation (2022) 0.00
- Tightening Exploration In Upper Confidence Reinforcement Learning (2020) 0.00
- Continuous Action Reinforcement Learning From A Mixture Of Interpretable Experts (2020) 0.00
- Computationally Efficient Reinforcement Learning: Targeted Exploration Leveraging Simple Rules (2022) 2.26