Paper Title
Toward a Smart Resource Allocation Policy via Artificial Intelligence in 6G Networks: Centralized or Decentralized?
Paper Authors
Paper Abstract
In this paper, we design a new smart software-defined radio access network (RAN) architecture with important properties, such as flexibility and traffic awareness, for sixth-generation (6G) wireless networks. In particular, we consider a hierarchical resource allocation framework for the proposed smart soft-RAN model, where the software-defined network (SDN) controller is the first and foremost layer of the framework. This unit dynamically monitors the network and intelligently selects the network operation type, i.e., whether resource allocation is performed through a centralized or a distributed architecture. Our aim is to make the network more scalable and more flexible in terms of achievable data rate, overhead, and complexity. To this end, we introduce a new metric, throughput overhead complexity (TOC), for the proposed machine learning-based algorithm, which captures the trade-off among these performance indicators. In particular, the TOC-based decision-making problem is solved via deep reinforcement learning (DRL), which determines an appropriate resource allocation policy. Furthermore, for the selected algorithm, we employ the soft actor-critic (SAC) method, which is more accurate, scalable, and robust than other learning methods. Simulation results demonstrate that the proposed smart network achieves better performance in terms of TOC than fixed centralized or distributed resource management schemes that lack dynamism. Moreover, our proposed algorithm outperforms conventional learning methods employed in other state-of-the-art network designs.
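To make the mode-selection idea concrete, the following minimal Python sketch shows how an SDN controller might scalarize throughput, overhead, and complexity into a TOC-style score and then choose between centralized and distributed operation. The ModeStats fields, the weights, and the epsilon-greedy rule are illustrative assumptions; the paper defines its own TOC metric and selects the operation mode with a soft actor-critic DRL agent rather than this simple greedy rule.

```python
import random
from dataclasses import dataclass


@dataclass
class ModeStats:
    """Hypothetical per-mode statistics observed by the SDN controller."""
    throughput: float   # achievable data rate (e.g., bit/s/Hz)
    overhead: float     # signaling overhead (e.g., CSI feedback load)
    complexity: float   # computational cost of the allocation algorithm


def toc_score(stats: ModeStats, w_overhead: float = 0.5, w_complexity: float = 0.5) -> float:
    """One plausible TOC-style scalarization: reward throughput and penalize
    overhead and complexity. The paper's actual TOC metric may differ."""
    return stats.throughput - w_overhead * stats.overhead - w_complexity * stats.complexity


def select_mode(centralized: ModeStats, distributed: ModeStats, epsilon: float = 0.1) -> str:
    """Greedy mode selection with epsilon-exploration, standing in for the
    SAC-based DRL policy described in the abstract."""
    if random.random() < epsilon:
        return random.choice(["centralized", "distributed"])
    return ("centralized"
            if toc_score(centralized) >= toc_score(distributed)
            else "distributed")


if __name__ == "__main__":
    # Toy numbers: centralized yields a higher rate but more overhead/complexity.
    centralized = ModeStats(throughput=12.0, overhead=6.0, complexity=8.0)
    distributed = ModeStats(throughput=9.0, overhead=2.0, complexity=3.0)
    print(select_mode(centralized, distributed))
```

In the full design, this hand-tuned comparison would be replaced by a learned policy: the SAC agent observes network state, receives a TOC-based reward, and over time learns when switching between centralized and distributed resource allocation pays off.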