Angrier Birds: Bayesian Reinforcement Learning
2016 · Imanol Arrieta Ibarra, Bernardo Ramos, Lars Roemheld
Abstract
We train a reinforcement learner to play a simplified version of the game Angry Birds. The learner is provided with a game state in a manner similar to the output that could be produced by computer vision algorithms. We improve on the efficiency of regular $\epsilon$-greedy Q-Learning with linear function approximation through more systematic exploration in Randomized Least Squares Value Iteration (RLSVI), an algorithm that samples its policy from a posterior distribution on optimal policies. With larger state-action spaces, efficient exploration becomes increasingly important, as evidenced by the faster learning in RLSVI.
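The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors, the noise scale `sigma`, the prior precision `lam`, and all function names are assumptions chosen for illustration, and only a single regression step is shown rather than the full value iteration. `epsilon_greedy` explores by occasionally picking a uniformly random action, while `sample_weights` draws linear Q-function weights from a Gaussian posterior (Bayesian linear regression on value targets) and `greedy_action` then acts greedily on that draw.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Baseline: uniform random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def cholesky(P):
    """Lower-triangular L with L L^T = P, for a small positive-definite P."""
    d = len(P)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = P[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i, row in enumerate(L):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def solve_lower_transposed(L, y):
    """Solve L^T x = y by back substitution."""
    d = len(L)
    x = [0.0] * d
    for i in reversed(range(d)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, d))
        x[i] = (y[i] - s) / L[i][i]
    return x


def sample_weights(features, targets, sigma, lam, rng):
    """Return (sampled weights, posterior mean) for a linear value model.

    Assumed model: targets = features . theta + N(0, sigma^2) noise,
    with prior theta ~ N(0, I / lam). Both are illustrative choices.
    """
    d = len(features[0])
    # Posterior precision P = A^T A / sigma^2 + lam * I, and A^T b / sigma^2.
    P = [[lam if i == j else 0.0 for j in range(d)] for i in range(d)]
    rhs = [0.0] * d
    for phi, y in zip(features, targets):
        for i in range(d):
            rhs[i] += phi[i] * y / sigma ** 2
            for j in range(d):
                P[i][j] += phi[i] * phi[j] / sigma ** 2
    L = cholesky(P)
    mean = solve_lower_transposed(L, solve_lower(L, rhs))
    # If z ~ N(0, I), then x solving L^T x = z has covariance P^{-1}.
    z = [rng.gauss(0.0, 1.0) for _ in range(d)]
    noise = solve_lower_transposed(L, z)
    return [m + n for m, n in zip(mean, noise)], mean


def greedy_action(theta, action_features):
    """Act greedily with respect to the sampled weights."""
    def q(a):
        return sum(t * f for t, f in zip(theta, action_features[a]))
    return max(range(len(action_features)), key=q)


if __name__ == "__main__":
    rng = random.Random(0)
    feats = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical features, one per action
    theta, mean = sample_weights(feats, [2.0, -1.0], 1.0, 1.0, rng)
    print("posterior mean:", mean)
    print("sampled weights:", theta)
    print("action from sample:", greedy_action(theta, feats))
    print("epsilon-greedy action:", epsilon_greedy([2.0, -1.0], 0.1, rng))
```

In this sketch the posterior covariance shrinks as more observations accumulate, so the sampled weights concentrate around the regression estimate and exploration fades on its own, whereas the $\epsilon$-greedy baseline keeps exploring at a fixed rate regardless of how much is known about each action.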