Episodic Reinforcement Learning With Expanded State-reward Space
2024 · Dayang Liang, Yaru Zhang, Yunlong Liu
Abstract
Empowered by deep neural networks, deep reinforcement learning (DRL) has demonstrated tremendous empirical success in various domains, including games, health care, and autonomous driving. Despite these advances, DRL remains data-inefficient, because learning an effective policy demands a vast number of environment samples. Recently, episodic control (EC)-based model-free DRL methods have improved sample efficiency by recalling past experiences from episodic memory. However, existing EC-based methods neglect the (past) retrieved states, which carry extensive information, and therefore suffer from a potential misalignment between the state and reward spaces; this likely causes inaccurate value estimation and degraded policy performance. To tackle this issue, we introduce an efficient EC-based DRL framework with an expanded state-reward space, where the expanded states used as the input and the expanded rewards used in training both contain historical and current informa
Related papers
- Exploration Via Elliptical Episodic Bonuses (2022)
- DEIR: Efficient And Robust Exploration Through Discriminative-model-based Episodic Intrinsic Rewards (2023)
- Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity (2021)
- Decoupled Exploration And Exploitation Policies For Sample-efficient Reinforcement Learning (2021)
- Continuous Episodic Control (2022)
- Sequential Memory Improves Sample And Memory Efficiency In Episodic Control (2021)
- Evolution-guided Policy Gradient In Reinforcement Learning (2018)
- Learning Self-imitating Diverse Policies (2018)