Co-adaptation Of Algorithmic And Implementational Innovations In Inference-based Deep Reinforcement Learning
2021 · Hiroki Furuta, Tadashi Kozuno, Tatsuya Matsushima, et al.
Abstract
Recently, many algorithms have been devised for reinforcement learning (RL) with function approximation. While they have clear algorithmic distinctions, they also have many implementation differences that are algorithm-independent and sometimes under-emphasized. Such mixing of algorithmic novelty and implementation craftsmanship makes rigorous analyses of the sources of performance improvements across algorithms difficult. In this work, we focus on a series of off-policy inference-based actor-critic algorithms -- MPO, AWR, and SAC -- to decouple their algorithmic innovations and implementation decisions. We present unified derivations through a single control-as-inference objective, where we can categorize each algorithm as based on either Expectation-Maximization (EM) or direct Kullback-Leibler (KL) divergence minimization and treat the rest of the specifications as implementation details. We performed extensive ablation studies and identified substantial performance drops whenever implementat
Related papers
- Discriminator-actor-critic: Addressing Sample Inefficiency And Reward Bias In Adversarial Imitation Learning (2018)
- Efficient Exploration In Deep Reinforcement Learning: A Novel Bayesian Actor-critic Algorithm (2024)
- Actor-critic Policy Optimization In Partially Observable Multiagent Environments (2018)
- Studying The Interplay Between The Actor And Critic Representations In Reinforcement Learning (2025)
- Broad Critic Deep Actor Reinforcement Learning For Continuous Control (2024)
- Unified Algorithms For RL With Decision-estimation Coefficients: PAC, Reward-free, Preference-based Learning, And Beyond (2022)
- On The Mistaken Assumption Of Interchangeable Deep Reinforcement Learning Implementations (2025)
- Recursive Least Squares Advantage Actor-critic Algorithms (2022)