Multi-Agent Reinforcement Learning Resources Allocation Method Using Dueling Double Deep Q-Network in Vehicular Networks
Vehicle-to-vehicle (V2V) communications are frequent and periodic and involve group sending and group receiving, which leads to serious collisions over wireless resources and limits system capacity. Moreover, the rapid channel variations in high-mobility vehicular environments preclude collecting accurate instantaneous channel state information at the base station for centralized resource management. For the Internet of Vehicles (IoV), it is a fundamental challenge to achieve low-latency, high-reliability communication for real-time, short-range data exchange in a complex wireless propagation environment, and to mitigate or avoid inter-vehicle interference in the region through reasonable spectrum allocation. To address these problems, this paper proposes a resource allocation (RA) method based on dueling double deep Q-network reinforcement learning (RL) with low-dimensional fingerprints and a soft-update architecture (D3QN-LS). We construct a multi-agent model in a virtual urban environment with a Manhattan grid layout, in which each V2V link acts as an agent and reuses vehicle-to-infrastructure (V2I) spectrum resources. In addition, we increase the amount of transmitted data and add scenarios in which spectrum resources are relatively scarce, i.e., the number of V2V links is significantly larger than the amount of available spectrum, to address some shortcomings of existing studies. We demonstrate that the proposed D3QN-LS algorithm further improves the total capacity of the V2I links and the success rate of periodic secure message transmission over the V2V links.
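For readers unfamiliar with the components named above, the following is a minimal NumPy sketch of the standard building blocks that a D3QN-LS-style agent combines: the dueling aggregation of value and advantage, the double-DQN target, the soft (Polyak) target update, and a low-dimensional fingerprint appended to the observation. It is an illustration under stated assumptions, not the implementation used in this paper. All function names, the default values of gamma and tau, and the choice of (exploration rate, training progress) as the fingerprint are hypothetical.

```python
import numpy as np

def dueling_q(value, advantage):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    # Subtracting the mean advantage keeps V and A identifiable.
    return value + advantage - advantage.mean(axis=-1, keepdims=True)

def double_dqn_target(reward, q_next_online, q_next_target, gamma=0.99, done=False):
    """Double-DQN target: the online net picks the action, the target net scores it."""
    best_action = int(np.argmax(q_next_online))
    bootstrap = 0.0 if done else gamma * float(q_next_target[best_action])
    return reward + bootstrap

def soft_update(target_params, online_params, tau=0.01):
    """Soft update: target <- (1 - tau) * target + tau * online, per parameter array."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]

def add_fingerprint(observation, epsilon, progress):
    """Append a low-dimensional fingerprint to a local observation.

    Assumption: the fingerprint is (exploration rate, training progress);
    the fingerprint actually used by the method may differ.
    """
    return np.concatenate([observation, [epsilon, progress]])
```

In a multi-agent setting, each V2V link would hold its own online and target networks and apply these steps to its local observation. The fingerprint gives an agent a compact indication of how far the other agents' policies have evolved, which is a common way to reduce the non-stationarity of experience replay.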