Interpretable Option Discovery Using Deep Q-learning And Variational Autoencoders
2022 · Per-Arne Andersen, Ole-Christoffer Granmo, Morten Goodwin
Abstract
Deep Reinforcement Learning (RL) is a robust framework for training autonomous agents in a wide variety of disciplines. However, traditional deep and shallow model-free RL algorithms suffer from low sample efficiency and inadequate generalization for sparse state spaces. The options framework with temporal abstractions is perhaps the most promising method to solve these problems, but it still has noticeable shortcomings: it only guarantees local convergence, and its initiation and termination conditions are hard to automate, so in practice they are commonly hand-crafted. Our proposal, the Deep Variational Q-Network (DVQN), combines deep generative learning and reinforcement learning. The algorithm finds good policies from a Gaussian-distributed latent space, which is especially useful for defining options. The DVQN algorithm uses MSE with KL-divergence as regularization, combined with traditional Q-Learning updates. The algorithm learns a latent space that represents good polic
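The loss described in the abstract (MSE with KL-divergence regularization, combined with Q-Learning updates) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the weighting factor `beta`, the discount `gamma`, the unit-Gaussian prior, and the way the three terms are summed are all assumptions made for illustration.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

def dvqn_style_loss(state, recon, mu, log_var, q_values, action,
                    reward, next_q_values, done, gamma=0.99, beta=1.0):
    """Hypothetical combined loss: reconstruction MSE + beta * KL + squared TD error.

    All arguments are batched arrays; `action` holds integer action indices.
    """
    # Reconstruction term: how well the decoder reproduces the input state.
    recon_mse = np.mean((state - recon) ** 2, axis=-1)
    # Regularizer: pulls the latent posterior toward a unit Gaussian (assumed prior).
    kl = kl_to_standard_normal(mu, log_var)
    # Standard Q-Learning target: r + gamma * max_a' Q(s', a'), zeroed at terminal states.
    target = reward + gamma * (1.0 - done) * np.max(next_q_values, axis=-1)
    q_taken = np.take_along_axis(q_values, action[:, None], axis=-1)[:, 0]
    td_error = (target - q_taken) ** 2
    return float(np.mean(recon_mse + beta * kl + td_error))
```

In this sketch, `beta` trades off how strongly the latent space is pushed toward the Gaussian prior against reconstruction and value accuracy; the abstract does not state how the authors weight these terms.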
Related papers
- Classifying Options For Deep Reinforcement Learning (2016)
- Langevin DQN (2020)
- VIME: Variational Information Maximizing Exploration (2016)
- Off-policy Deep Reinforcement Learning With Analogous Disentangled Exploration (2020)
- DVPO: Distributional Value Modeling-based Policy Optimization For LLM Post-training (2026)
- Resource Governance In Networked Systems Via Integrated Variational Autoencoders And Reinforcement Learning (2024)
- VariBAD: A Very Good Method For Bayes-adaptive Deep RL Via Meta-learning (2019)
- Approximating Two Value Functions Instead Of One: Towards Characterizing A New Family Of Deep Reinforcement Learning Algorithms (2019)