Multi-timescale, Gradient Descent, Temporal Difference Learning With Linear Options
2017 · Peeyush Kumar, Doina Precup
Abstract
Deliberating over large or continuous state spaces has been a long-standing challenge in reinforcement learning. Temporal abstraction has made this partly feasible, but planning efficiently with temporal abstraction remains an issue. Moreover, using spatial abstraction to learn policies for various situations at once while using temporal abstraction models is an open problem. We propose an efficient algorithm that is convergent under linear function approximation while planning with temporally abstract actions. We show how this algorithm can be used with randomly generated option models over multiple time scales to plan for agents that need to act in real time. Using these randomly generated option models over multiple time scales is shown to reduce the number of decision epochs required to solve the given task, effectively reducing the time needed for deliberation.
Related papers
- Autonomous Option Invention For Continual Hierarchical Reinforcement Learning And Planning (2024)
- Reusable Options Through Gradient-based Meta Learning (2022)
- Hierarchical Deep Multiagent Reinforcement Learning With Temporal Abstraction (2018)
- Fast Two-time-scale Stochastic Gradient Method With Applications In Reinforcement Learning (2024)
- Temporal Abstractions-augmented Temporally Contrastive Learning: An Alternative To The Laplacian In RL (2022)
- Single-timescale Stochastic Nonconvex-concave Optimization For Smooth Nonlinear TD Learning (2020)
- A Hierarchical Reinforcement Learning Method For Persistent Time-sensitive Tasks (2016)
- Finite-sample Analysis Of Decentralized Temporal-difference Learning With Linear Function Approximation (2019)