Towards Adapting Reinforcement Learning Agents To New Tasks: Insights From Q-values
2024 · Ashwin Ramaswamy, Ransalu Senanayake
Abstract
While contemporary reinforcement learning research and applications have embraced policy gradient methods as the panacea for learning problems, value-based methods can still be useful in many domains, provided we can work out how to exploit them in a sample-efficient way. In this paper, we explore the chaotic nature of DQNs in reinforcement learning and examine how the information they retain after training can be repurposed to adapt a model to different tasks. We start by designing a simple experiment in which we can observe the Q-values for each state and action in an environment. We then train in eight different ways to explore how these training algorithms affect whether accurate Q-values are learned. We test the adaptability of each trained model when it is retrained to accomplish a slightly modified task. We then scale our setup to the larger problem of an autonomous vehicle at an unprotected intersection. We observed th
Related papers
- Digi-Q: Learning Q-value Functions For Training Device-control Agents (2025)
- Approximating Gradients For Differentiable Quality Diversity In Reinforcement Learning (2022)
- Approximating Two Value Functions Instead Of One: Towards Characterizing A New Family Of Deep Reinforcement Learning Algorithms (2019)
- Rethinking Value Function Learning For Generalization In Reinforcement Learning (2022)
- Seizing Serendipity: Exploiting The Value Of Past Success In Off-policy Actor-critic (2023)
- Dissecting Deep RL With High Update Ratios: Combatting Value Divergence (2024)
- On The Model-based Stochastic Value Gradient For Continuous Reinforcement Learning (2020)
- Modular Multi-objective Deep Reinforcement Learning With Decision Values (2017)