Hybrid Belief Reinforcement Learning For Efficient Coordinated Spatial Exploration
2026 · Danish Rizvi, David Boyle
Abstract
Coordinating multiple autonomous agents to explore and serve spatially heterogeneous demand requires jointly learning unknown spatial patterns and planning trajectories that maximize task performance. Pure model-based approaches provide structured uncertainty estimates but lack adaptive policy learning, while deep reinforcement learning often suffers from poor sample efficiency when spatial priors are absent. This paper presents a hybrid belief-reinforcement learning (HBRL) framework to address this gap. In the first phase, agents construct spatial beliefs using a Log-Gaussian Cox Process (LGCP) and execute information-driven trajectories guided by a Pathwise Mutual Information (PathMI) planner with multi-step lookahead. In the second phase, trajectory control is transferred to a Soft Actor-Critic (SAC) agent, warm-started through dual-channel knowledge transfer: belief state initialization supplies spatial uncertainty, and replay buffer seeding provides demonstration trajectories gene
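The replay buffer seeding channel described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `ReplayBuffer`, `greedy_demo`, and `belief_variance` are hypothetical, the grid of variances stands in for an LGCP belief, and the one-step greedy rule is a simple substitute for the PathMI planner, not the authors' method. The SAC learner itself is omitted; the sketch only shows demonstration transitions being placed in the buffer before learning would start.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity FIFO buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement from the stored transitions.
        return rng.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


# Hypothetical discrete action set on a grid: (row offset, column offset).
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def greedy_demo(variance, start, steps):
    """Roll out a one-step greedy planner on a grid of belief variances.

    Assumed stand-in for an information-driven planner: at each step the
    agent moves to the in-bounds neighbour with the highest remaining
    variance, receives that variance as reward, and the cell is then
    treated as observed (variance set to zero).
    """
    var = [row[:] for row in variance]  # copy so the caller's belief is untouched
    rows, cols = len(var), len(var[0])
    pos = start
    var[pos[0]][pos[1]] = 0.0  # the starting cell counts as observed
    transitions = []
    for t in range(steps):
        candidates = []
        for name, (dr, dc) in MOVES.items():
            r, c = pos[0] + dr, pos[1] + dc
            if 0 <= r < rows and 0 <= c < cols:
                candidates.append((var[r][c], name, (r, c)))
        reward, action, nxt = max(candidates, key=lambda item: item[0])
        var[nxt[0]][nxt[1]] = 0.0
        transitions.append((pos, action, reward, nxt, t == steps - 1))
        pos = nxt
    return transitions


# Illustrative belief variances (made-up numbers, not results from the paper).
belief_variance = [
    [0.1, 0.9, 0.2],
    [0.8, 0.5, 0.3],
    [0.4, 0.7, 0.6],
]

# Seed the buffer with planner demonstrations before any policy learning.
buffer = ReplayBuffer(capacity=1000)
demos = greedy_demo(belief_variance, start=(1, 1), steps=3)
for transition in demos:
    buffer.add(transition)
```

In this sketch an off-policy learner would draw its first minibatches from `buffer.sample(...)`, so early updates see planner-generated transitions rather than purely random ones. How the paper weights or mixes demonstrations with online experience is not stated in the text above.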
Related papers
- Deep Multi-agent Reinforcement Learning With Discrete-continuous Hybrid Action Spaces (2019)
- Multi-agent Cooperative Games Using Belief Map Assisted Training (2024)
- Coordinated Exploration Via Intrinsic Rewards For Multi-agent Reinforcement Learning (2019)
- Deep Multi-agent Reinforcement Learning With Hybrid Action Spaces Based On Maximum Entropy (2022)
- A Further Exploration Of Deep Multi-agent Reinforcement Learning With Hybrid Action Space (2022)
- Learning Complex Spatial Behaviours In ABM: An Experimental Observational Study (2022)
- Efficient Model-based Reinforcement Learning Through Optimistic Policy Search And Planning (2020)
- Efficient Model-based Multi-agent Reinforcement Learning Via Optimistic Equilibrium Computation (2022)