Exploratory Gradient Boosting For Reinforcement Learning In Complex Domains
2016 · David Abel, Alekh Agarwal, Fernando Diaz, et al.
Abstract
High-dimensional observations and complex real-world dynamics present major challenges in reinforcement learning for both function approximation and exploration. We address both of these challenges with two complementary techniques: First, we develop a gradient-boosting style, non-parametric function approximator for learning on \(Q\)-function residuals. And second, we propose an exploration strategy inspired by the principles of state abstraction and information acquisition under uncertainty. We demonstrate the empirical effectiveness of these techniques, first, as a preliminary check, on two standard tasks (Blackjack and \(n\)-Chain), and then on two much larger and more realistic tasks with high-dimensional observation spaces. Specifically, we introduce two benchmarks built within the game Minecraft where the observations are pixel arrays of the agent's visual field. A combination of our two algorithmic techniques performs competitively on the standard reinforcement-learning tasks w
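To make the first technique concrete, here is a minimal sketch of a gradient-boosting style \(Q\)-function fitted on Bellman residuals. It is not the authors' implementation: the weak learner (a depth-1 regression stump on a scalar state), the shrinkage `step`, the toy deterministic chain, and all names (`fit_stump`, `BoostedQ`, `chain_transitions`, `train_chain`) are assumptions made for illustration, and the exploration strategy from the abstract is not modeled at all.

```python
import numpy as np

def fit_stump(x, y):
    """Least-squares depth-1 regression stump on a 1-D feature (illustrative weak learner)."""
    best = (np.inf, None, y.mean(), y.mean())
    for t in np.unique(x)[:-1]:  # candidate thresholds; both sides stay non-empty
        left, right = y[x <= t], y[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, lo, hi = best
    if t is None:  # only one distinct x value: fall back to a constant
        return lambda z: np.full(len(z), lo)
    return lambda z: np.where(z <= t, lo, hi)

class BoostedQ:
    """Q(s, a) as a shrunken sum of weak learners, one ensemble per action."""
    def __init__(self, n_actions, step=0.5):
        self.learners = [[] for _ in range(n_actions)]
        self.step = step  # shrinkage applied to every weak learner (assumed value)

    def predict(self, states, action):
        states = np.asarray(states, dtype=float)
        total = np.zeros(len(states))
        for h in self.learners[action]:
            total += self.step * h(states)
        return total

    def boost(self, s, a, r, s_next, done, gamma):
        """One boosting round: fit a new weak learner per action to the Bellman residuals."""
        n_actions = len(self.learners)
        q_next = np.max([self.predict(s_next, b) for b in range(n_actions)], axis=0)
        target = r + gamma * q_next * (1.0 - done)
        for b in range(n_actions):
            mask = a == b
            if mask.any():
                residual = target[mask] - self.predict(s[mask], b)
                self.learners[b].append(fit_stump(s[mask], residual))

def chain_transitions(n_states):
    """All (s, a) pairs of a toy chain: a=1 steps right, a=0 steps left; last state is terminal."""
    s, a, r, s_next, done = [], [], [], [], []
    for state in range(n_states - 1):
        for action in (0, 1):
            nxt = state + 1 if action == 1 else max(state - 1, 0)
            reached_goal = 1.0 if nxt == n_states - 1 else 0.0
            s.append(state); a.append(action); s_next.append(nxt)
            r.append(reached_goal); done.append(reached_goal)
    return tuple(np.array(v, dtype=float) for v in (s, a, r, s_next, done))

def train_chain(n_states=5, rounds=100, gamma=0.8):
    s, a, r, s_next, done = chain_transitions(n_states)
    q = BoostedQ(n_actions=2)
    for _ in range(rounds):
        q.boost(s, a, r, s_next, done, gamma)
    return q
```

Calling `q = train_chain()` and comparing `q.predict(np.array([3.0]), 1)` with `q.predict(np.array([3.0]), 0)` should show the rightward action valued higher next to the goal. The sketch uses an exhaustive batch of transitions, so it sidesteps exploration entirely; with sampled data the residual targets would depend on which states the agent has visited.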
Related papers
- Improving Policy Gradient By Exploring Under-appreciated Rewards (2016)
- Multi-task Curriculum Learning In A Complex, Visual, Hard-exploration Domain: Minecraft (2021)
- Mitigating Suboptimality Of Deterministic Policy Gradients In Complex Q-functions (2024)
- Go-explore: A New Approach For Hard-exploration Problems (2019)
- Experience Augmentation: Boosting And Accelerating Off-policy Multi-agent Reinforcement Learning (2020)
- Information-directed Exploration For Deep Reinforcement Learning (2018)
- A Nearly Optimal And Low-switching Algorithm For Reinforcement Learning With General Function Approximation (2023)
- Approximating Gradients For Differentiable Quality Diversity In Reinforcement Learning (2022)