Amna Kopic, Kenan Turbic, H. Gačanin
3 28. 5. 2023.

On Reward Shaping Methods in Deep Reinforcement Learning for Radio Resource Management in Wireless Networks

This paper presents a comprehensive study of learning models over a dynamic wireless channel, covering their power-constraint violation, their sum-rate performance under the power constraint, and their computational efficiency in terms of training and execution times. We propose a reward shaping method and modify the learning models with an output scaling strategy that forces them to fully respect the power constraint while optimizing the sum-rate performance. The proposed approach reaches close-to-optimal accuracy, i.e., up to 99.15%, while satisfying the predefined power constraint of the base station. Moreover, the learning models are shown to be more computationally efficient than the traditional algorithm. However, solving the power allocation problem within the Orthogonal Frequency Division Multiplexing (OFDM) symbol duration of $16.7\,\mu\mathrm{s}$ remains a challenge.
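The abstract does not spell out how the reward shaping and output scaling are defined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes interference-free subcarriers, a Shannon-rate objective, proportional rescaling of the outputs onto the power budget, and a linear penalty on the raw violation; the names `scale_to_power_budget`, `sum_rate`, `shaped_reward`, and `penalty_weight` are hypothetical.

```python
import numpy as np


def scale_to_power_budget(raw_powers, p_max):
    """Output scaling (assumed form): clip negatives, then shrink the
    allocation proportionally if its total exceeds the budget p_max."""
    powers = np.maximum(np.asarray(raw_powers, dtype=float), 0.0)
    total = powers.sum()
    if total > p_max:
        powers = powers * (p_max / total)
    return powers


def sum_rate(powers, channel_gains, noise_power=1.0):
    """Sum of Shannon rates in bit/s/Hz, assuming no interference."""
    snr = np.asarray(powers, dtype=float) * np.asarray(channel_gains, dtype=float) / noise_power
    return float(np.sum(np.log2(1.0 + snr)))


def shaped_reward(raw_powers, channel_gains, p_max, penalty_weight=1.0, noise_power=1.0):
    """Reward shaping (assumed form): sum-rate of the scaled, feasible
    allocation minus a linear penalty on how far the raw output
    overshoots the power budget."""
    raw = np.maximum(np.asarray(raw_powers, dtype=float), 0.0)
    violation = max(0.0, float(raw.sum()) - p_max)
    feasible = scale_to_power_budget(raw, p_max)
    return sum_rate(feasible, channel_gains, noise_power) - penalty_weight * violation
```

In this sketch the scaling step guarantees that the executed allocation is always feasible, while the penalty term discourages the model from relying on the scaling in the first place. An output that already meets the budget is left unchanged and receives the plain sum-rate as its reward.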

