Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications

Paper Title

Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications

Authors

Lin, Jiaye, Zou, Yuze, Dong, Xiaoru, Gong, Shimin, Hoang, Dinh Thai, Niyato, Dusit

Abstract

The intelligent reflecting surface (IRS) is a promising technology for assisting downlink information transmission from a multi-antenna access point (AP) to a receiver. In this paper, we minimize the AP's transmit power by jointly optimizing the AP's active beamforming and the IRS's passive beamforming. Because the channel conditions are uncertain, we formulate a robust power minimization problem subject to the receiver's signal-to-noise ratio (SNR) requirement and the IRS's power budget constraint. We propose a deep reinforcement learning (DRL) approach that adapts the beamforming strategies based on past experiences. To improve the learning performance, we derive a convex approximation as a lower bound on the robust problem. We integrate this approximation into the DRL framework, which yields a novel optimization-driven deep deterministic policy gradient (DDPG) approach. In particular, when the DDPG algorithm generates one part of the action (e.g., the passive beamforming), we can use the model-based convex approximation to optimize the other part of the action (e.g., the active beamforming) more efficiently. Our simulation results demonstrate that the optimization-driven DDPG algorithm significantly improves both the learning rate and the reward performance compared to the conventional model-free DDPG algorithm.
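
The split of the action described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a perfectly known channel (so it omits the robust formulation and the convex approximation), a single-antenna receiver, and random phase shifts standing in for the DDPG actor's output. The function names (effective_channel, min_power_beamformer, snr, transmit_power) and the channel model are hypothetical and introduced only for illustration. Given the passive beamforming phases, the active beamformer that meets the SNR target with minimum power has a closed form in this simplified setting: it is aligned with the effective channel and scaled so that the SNR constraint is tight.

```python
import cmath
import math
import random


def effective_channel(h_direct, G, h_reflect, phases):
    """Combine the direct AP-receiver channel with the IRS-reflected path.

    h_direct: M complex gains (AP antennas to receiver), assumed known.
    G: N x M complex gains (AP antennas to IRS elements), assumed known.
    h_reflect: N complex gains (IRS elements to receiver), assumed known.
    phases: N phase shifts in radians (the "passive beamforming" part).
    """
    h_eff = list(h_direct)
    for n, phi in enumerate(phases):
        coeff = cmath.exp(1j * phi) * h_reflect[n]
        for m in range(len(h_eff)):
            h_eff[m] += coeff * G[n][m]
    return h_eff


def min_power_beamformer(h_eff, snr_target, noise_power):
    """Minimum-power active beamformer for a known channel.

    Aligns the beamformer with h_eff and scales it so that
    |h_eff^H w|^2 / noise_power equals snr_target exactly.
    """
    norm_sq = sum(abs(x) ** 2 for x in h_eff)
    scale = math.sqrt(snr_target * noise_power) / norm_sq
    return [scale * x for x in h_eff]


def snr(h_eff, w, noise_power):
    """Received SNR |h_eff^H w|^2 / noise_power."""
    inner = sum(h.conjugate() * x for h, x in zip(h_eff, w))
    return abs(inner) ** 2 / noise_power


def transmit_power(w):
    """Transmit power ||w||^2."""
    return sum(abs(x) ** 2 for x in w)


if __name__ == "__main__":
    rng = random.Random(0)
    M, N = 4, 8  # AP antennas and IRS elements (illustrative sizes)

    def rand_c():
        return complex(rng.gauss(0, 1), rng.gauss(0, 1))

    h_d = [rand_c() for _ in range(M)]
    G = [[rand_c() for _ in range(M)] for _ in range(N)]
    h_r = [rand_c() for _ in range(N)]

    # Stand-in for the actor network: random phase shifts.
    phases = [rng.uniform(0, 2 * math.pi) for _ in range(N)]

    h_eff = effective_channel(h_d, G, h_r, phases)
    w = min_power_beamformer(h_eff, snr_target=10.0, noise_power=1.0)
    print("transmit power:", transmit_power(w))
    print("achieved SNR:", snr(h_eff, w, 1.0))
```

In a learning loop of the kind the abstract outlines, the resulting transmit power could serve as the (negative) reward for the phase shifts proposed by the actor, so the agent only has to explore the passive part of the action.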
