Spectral Normalisation For Deep Reinforcement Learning: An Optimisation Perspective
2021 · Florin Gogianu, Tudor Berariu, Mihaela Rosca, et al.
Abstract
Most of the recent deep reinforcement learning advances take an RL-centric perspective and focus on refinements of the training objective. We diverge from this view and show we can recover the performance of these developments not by changing the objective, but by regularising the value-function estimator. Constraining the Lipschitz constant of a single layer using spectral normalisation is sufficient to elevate the performance of a Categorical-DQN agent to that of a more elaborate Rainbow agent on the challenging Atari domain. We conduct ablation studies to disentangle the various effects normalisation has on the learning dynamics and show that it is sufficient to modulate the parameter updates to recover most of the performance of spectral normalisation. These findings hint towards the need to also focus on the neural component and its learning dynamics to tackle the peculiarities of Deep Reinforcement Learning.
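As an illustration of what constraining the Lipschitz constant of a single layer can look like, the sketch below estimates the largest singular value of a dense weight matrix by power iteration and rescales the matrix so that this value is approximately 1. It is a minimal sketch and not the authors' implementation: the function names `spectral_norm` and `spectrally_normalise`, the restriction to a 2-D weight matrix, and the choice to run many iterations from a fresh random vector are all assumptions made for clarity. Practical implementations usually keep the vector `u` between training steps and run only one iteration per step.

```python
import numpy as np


def spectral_norm(weight, n_iters=50, seed=0):
    """Estimate the largest singular value of a 2-D matrix by power iteration."""
    rng = np.random.default_rng(seed)
    # Hypothetical initialisation: a random unit vector in the output space.
    u = rng.standard_normal(weight.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_iters):
        # Alternate between right (v) and left (u) singular vector estimates.
        v = weight.T @ u
        v /= np.linalg.norm(v)
        u = weight @ v
        u /= np.linalg.norm(u)
    # With converged u and v, u^T W v approximates the top singular value.
    return float(u @ weight @ v)


def spectrally_normalise(weight, n_iters=50):
    """Rescale the matrix so its largest singular value is approximately 1."""
    return weight / spectral_norm(weight, n_iters)


if __name__ == "__main__":
    w = np.diag([3.0, 1.0, 0.5])
    print(spectral_norm(w))
    print(np.linalg.norm(spectrally_normalise(w), 2))
```

For a linear layer, the largest singular value of the weight matrix is the layer's Lipschitz constant with respect to the Euclidean norm, so the rescaled layer is approximately 1-Lipschitz.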
Related papers
- Effects Of Spectral Normalization In Multi-agent Reinforcement Learning (2022)
- Spectral Representation-based Reinforcement Learning (2025)
- Regularization Matters In Policy Optimization (2019)
- Revisiting Rainbow: Promoting More Insightful And Inclusive Deep Reinforcement Learning Research (2020)
- Balancing Interpretability And Performance In Reinforcement Learning: An Adaptive Spectral Based Linear Approach (2025)
- Normalization And Effective Learning Rates In Reinforcement Learning (2024)
- Dissecting Deep RL With High Update Ratios: Combatting Value Divergence (2024)
- XQC: Well-conditioned Optimization Accelerates Deep Reinforcement Learning (2025)