Paper Title
Reinforcement Learning for Dynamic Resource Optimization in 5G Radio Access Network Slicing
Paper Authors
Paper Abstract
The paper presents a reinforcement learning solution for dynamic resource allocation in 5G radio access network slicing. Available communication resources (frequency-time blocks and transmit powers) and computational resources (processor usage) are allocated to stochastic arrivals of network slice requests. Each request arrives with priority (weight), throughput, computational resource, and latency (deadline) requirements; if feasible, it is served with available communication and computational resources allocated over its requested duration. Since each resource allocation decision makes some resources temporarily unavailable for future requests, a myopic solution that optimizes only the current allocation becomes ineffective for network slicing. Therefore, a Q-learning solution is presented to maximize the network utility, measured as the total weight of granted network slice requests over a time horizon, subject to communication and computational constraints. Results show that reinforcement learning provides major improvements in 5G network utility relative to myopic, random, and first-come-first-served solutions. Reinforcement learning sustains scalable performance as the number of served users increases, and it can also effectively assign resources to network slices when 5G must share the spectrum with incumbent users that may dynamically occupy some of the frequency-time blocks.
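To make the setup concrete, below is a minimal tabular Q-learning sketch for the admission-control aspect of the problem. Everything here is an assumed simplification rather than the paper's formulation: the observation combines the remaining resource budgets (frequency-time blocks, processor units) with the incoming request, the action is grant or deny, and the reward is the weight of a granted request. Resource release after a request's duration expires, transmit-power control, and latency deadlines are omitted for brevity.

```python
# Illustrative tabular Q-learning sketch for slice admission control.
# All quantities (budgets, request distributions) are hypothetical.

import random
from collections import defaultdict

TOTAL_BLOCKS, TOTAL_CPU = 10, 10       # assumed resource budgets
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration
HORIZON, EPISODES = 20, 5000

Q = defaultdict(float)  # Q[(observation, action)] -> estimated value

def sample_request():
    """Stochastic slice request: (priority weight, blocks needed, CPU needed)."""
    return (random.randint(1, 5), random.randint(1, 4), random.randint(1, 4))

def step(resources, request, action):
    """Grant (1) or deny (0) a request; return (reward, next resources)."""
    blocks, cpu = resources
    weight, need_blocks, need_cpu = request
    if action == 1 and need_blocks <= blocks and need_cpu <= cpu:
        return float(weight), (blocks - need_blocks, cpu - need_cpu)
    return 0.0, resources  # denied or infeasible: no reward, budgets unchanged

def choose_action(obs):
    """Epsilon-greedy selection over the grant/deny actions."""
    if random.random() < EPSILON:
        return random.choice((0, 1))
    return max((0, 1), key=lambda a: Q[(obs, a)])

for _ in range(EPISODES):
    resources = (TOTAL_BLOCKS, TOTAL_CPU)
    request = sample_request()
    for _ in range(HORIZON):
        obs = resources + request
        action = choose_action(obs)
        reward, resources = step(resources, request, action)
        request = sample_request()  # next stochastic arrival
        next_obs = resources + request
        best_next = max(Q[(next_obs, a)] for a in (0, 1))
        # Standard Q-learning update toward the one-step bootstrapped target.
        Q[(obs, action)] += ALPHA * (reward + GAMMA * best_next - Q[(obs, action)])
```

In this simplified setting, a learned policy can decline low-weight requests whose resource needs would block likely higher-weight future arrivals, which is exactly the non-myopic behavior the abstract contrasts with myopic and first-come-first-served allocation.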