Paper Title

Cooperate or not Cooperate: Transfer Learning with Multi-Armed Bandit for Spatial Reuse in Wi-Fi

Paper Authors

Iturria-Rivera, Pedro Enrique, Chenier, Marcel, Herscovici, Bernard, Kantarci, Burak, Erol-Kantarci, Melike

Paper Abstract

The exponential increase in wireless devices running highly demanding services such as streaming video and gaming has imposed several challenges on Wireless Local Area Networks (WLANs). In the context of Wi-Fi, IEEE 802.11ax brings high data rates to dense user deployments. Additionally, it comes with new flexible features in the physical layer, such as a dynamic Clear Channel Assessment (CCA) threshold, with the goal of improving spatial reuse (SR) in response to radio spectrum scarcity in dense scenarios. In this paper, we formulate the Transmission Power (TP) and CCA configuration problem with the objective of maximizing fairness and minimizing station starvation. We present four main contributions to distributed SR optimization using Multi-Agent Multi-Armed Bandits (MA-MABs). First, we propose to reduce the action space, given the large cardinality of the combinations of TP and CCA threshold values per Access Point (AP). Second, we present two deep Multi-Agent Contextual MABs (MA-CMABs), named Sample Average Uncertainty (SAU)-Coop and SAU-NonCoop, as cooperative and non-cooperative versions to improve SR. In addition, we present an analysis of whether cooperation is beneficial, using MA-MAB solutions based on the ε-greedy, Upper Confidence Bound (UCB), and Thompson sampling techniques. Finally, we propose a deep reinforcement transfer learning technique to improve adaptability in dynamic environments. Simulation results show that cooperation via the SAU-Coop algorithm yields a 14.7% improvement in cumulative throughput and a 32.5% improvement in packet loss ratio (PLR) compared with non-cooperative approaches. Finally, under dynamic scenarios, transfer learning helps mitigate service drops for at least 60% of users.
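The abstract's per-AP bandit formulation can be illustrated with a minimal sketch. The TP and CCA value grids and the ε-greedy learner below are hypothetical illustrations, not the paper's actual configuration or algorithm (the paper's SAU-Coop/SAU-NonCoop methods are deep contextual bandits; ε-greedy is one of the baseline techniques it compares against):

```python
import random

# Hypothetical discretized action space: each arm is a (TP, CCA) pair.
# The dBm values are illustrative, not taken from the paper.
TP_VALUES = [5, 10, 15, 20]      # candidate transmission powers (dBm)
CCA_VALUES = [-82, -72, -62]     # candidate CCA thresholds (dBm)
ARMS = [(tp, cca) for tp in TP_VALUES for cca in CCA_VALUES]


class EpsilonGreedyBandit:
    """Per-AP epsilon-greedy multi-armed bandit over (TP, CCA) arms."""

    def __init__(self, n_arms, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm

    def select(self):
        # Explore a random arm with probability epsilon, else exploit.
        if random.random() < self.epsilon:
            return random.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental mean: v += (r - v) / n
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


# Usage: each AP runs its own bandit; the reward would come from the
# network (e.g., a fairness/throughput metric after applying the arm).
bandit = EpsilonGreedyBandit(len(ARMS), epsilon=0.1)
arm = bandit.select()
tp, cca = ARMS[arm]
bandit.update(arm, reward=0.8)  # placeholder reward for illustration
```

Enumerating the full Cartesian product of TP and CCA values is exactly the action-space blow-up the paper's first contribution addresses; a real deployment would prune this grid.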
