Soft Actor-critic With Inhibitory Networks For Faster Retraining
2022 · Jaime S. Ide, Daria Mićović, Michael J. Guarino, et al.
Abstract
Reusing previously trained models is critical in deep reinforcement learning to speed up training of new agents. However, it is unclear how to acquire new skills when objectives and constraints are in conflict with previously learned skills. Moreover, when retraining, there is an intrinsic conflict between exploiting what has already been learned and exploring new skills. In soft actor-critic (SAC) methods, a temperature parameter can be dynamically adjusted to weight the action entropy and balance the explore \(\times\) exploit trade-off. However, controlling a single coefficient can be challenging within the context of retraining, even more so when goals are contradictory. In this work, inspired by neuroscience research, we propose a novel approach using inhibitory networks to allow separate and adaptive state value evaluations, as well as distinct automatic entropy tuning. Ultimately, our approach allows for controlling inhibition to handle conflict between exploiting less risky, ac
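To make the single-coefficient mechanism mentioned in the abstract concrete, here is a minimal sketch of the automatic temperature adjustment commonly used in SAC implementations. It is not the inhibitory-network method this paper proposes. The function name `update_log_alpha`, the learning rate, and the sample values are hypothetical, and the loss is taken on `log_alpha` (a common implementation variant) rather than on the temperature itself.

```python
import math


def update_log_alpha(log_alpha, log_probs, target_entropy, lr=0.01):
    """One gradient-descent step on the temperature loss.

    Assumed loss (common implementation variant):
        L(log_alpha) = -log_alpha * mean(log_pi + target_entropy)
    so dL/dlog_alpha = -mean(log_pi + target_entropy).
    """
    mean_gap = sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    grad = -mean_gap
    return log_alpha - lr * grad


# Hypothetical target: a frequent heuristic is -dim(action space); here 2 dims.
target_entropy = -2.0

# Policy entropy (-mean log_pi = -3) is below the target, so the
# temperature is pushed up, which rewards more exploration.
raised = update_log_alpha(0.0, [3.0, 3.0], target_entropy)

# Policy entropy (-mean log_pi = -1) is above the target, so the
# temperature is pushed down, which favors exploitation.
lowered = update_log_alpha(0.0, [1.0, 1.0], target_entropy)

print(math.exp(raised), math.exp(lowered))  # temperatures alpha = exp(log_alpha)
```

Because one scalar is adjusted against one entropy target, the same coefficient has to serve both the previously learned behavior and the new skill, which is the difficulty the abstract raises for retraining.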
Related papers
- Band-limited Soft Actor Critic Model (2020)
- Learning Without Time-based Embodiment Resets In Soft-actor Critic (2025)
- Boosting Soft Actor-critic: Emphasizing Recent Experience Without Forgetting The Past (2019)
- Context-based Soft Actor Critic For Environments With Non-stationary Dynamics (2021)
- Regularized Soft Actor-critic For Behavior Transfer Learning (2022)
- Improved Soft Actor-critic: Mixing Prioritized Off-policy Samples With On-policy Experience (2021)
- Improving Exploration In Soft-actor-critic With Normalizing Flows Policies (2019)
- Soft Actor-critic: Off-policy Maximum Entropy Deep Reinforcement Learning With A Stochastic Actor (2018)