Dynamic Spectrum Aggregation and Access for Rescue Cognitive Networks Using Multi-Agent Actor-Critic Reinforcement Learning

Abstract

Unmanned aerial vehicle (UAV)-based emergency communication is attracting growing attention for disaster rescue, where a UAV can serve as an aerial base station to provide spectrum access in areas with power outages or network interruptions. However, UAVs can provide only limited spectrum access for a short period of time because of their limited battery energy. To provide spectrum access to more users under spectrum and energy constraints, a new dynamic spectrum aggregation and access scheme is proposed for rescue cognitive networks (RCNs), in which different kinds of users are modeled as primary users (PUs) and secondary users (SUs). In each time slot, SUs with different spectrum sensing capabilities, spectrum aggregation capabilities, and bandwidth demands sense and aggregate multiple spectrum slots that are not occupied by PUs for data transmission. Based on the states of the selected channels, the reward is designed as a function of the transmission rates and is fed back to the SUs. A maximum-entropy-based multi-agent actor-critic (ME-MAAC) algorithm is proposed to optimize the action selection policy of the SUs. Simulation results show that the proposed ME-MAAC algorithm achieves a higher achievable rate, a lower mean collision rate, and lower power consumption than state-of-the-art methods based on the Deep Q-Network (DQN) algorithm. The proposed RCNs provide a novel spectrum usage paradigm for emergency communication and rescue networks.
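As a rough illustration of the per-slot interaction summarized above, the following minimal sketch computes a rate-based reward for a single SU that aggregates several channels in one time slot. It is not the reward used by the proposed scheme: the function name `slot_reward`, the fixed SNR, the equal channel bandwidth, and the Shannon-rate form are all assumptions made for illustration, and collisions between SUs are ignored.

```python
import math


def slot_reward(pu_busy, chosen, max_aggregate, bandwidth_hz=1e6, snr_linear=10.0):
    """Toy per-slot reward for one SU (hypothetical form, not the paper's).

    pu_busy       -- list of bools, True if a PU occupies that channel this slot
    chosen        -- channel indices the SU decides to sense and aggregate
    max_aggregate -- the SU's spectrum aggregation capability (max channels)
    Returns (sum rate in bit/s over PU-free channels, number of PU collisions).
    """
    selected = set(chosen)
    if len(selected) > max_aggregate:
        raise ValueError("selection exceeds the SU's aggregation capability")

    # Assumed: every channel has the same bandwidth and SNR (Shannon rate).
    rate_per_channel = bandwidth_hz * math.log2(1.0 + snr_linear)

    # Channels free of PUs carry data; channels with an active PU collide.
    free = [c for c in selected if not pu_busy[c]]
    collisions = len(selected) - len(free)
    return len(free) * rate_per_channel, collisions


# Example: channels 0 and 3 are occupied by PUs; the SU picks channels 1, 2, 3.
pu_busy = [True, False, False, True]
reward, collisions = slot_reward(pu_busy, chosen=[1, 2, 3], max_aggregate=3)
```

In a learning setup, a value such as `reward` would be fed back to the SU after each slot, while `collisions` could be tracked separately to estimate a mean collision rate.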