Deep Learning-based Predictive Control of Battery Management for Frequency Regulation

Paper title

Deep Learning-based Predictive Control of Battery Management for Frequency Regulation

Authors

Li, Yun; Wang, Yixiu; Chen, Yifu; Hua, Kaixun; Ren, Jiayang; Mozafari, Ghazaleh; Lu, Qiugang; Cao, Yankai

Abstract

This paper proposes a deep learning-based optimal battery management scheme for frequency regulation (FR) by integrating model predictive control (MPC), supervised learning (SL), reinforcement learning (RL), and high-fidelity battery models. By taking advantage of deep neural networks (DNNs), the derived DNN-approximated policy is computationally efficient in online implementation. The design procedure of the proposed scheme consists of two sequential processes: (1) the SL process, in which we first run a simulation with an MPC embedding a low-fidelity battery model to generate a training data set, and then, based on the generated data set, we optimize a DNN-approximated policy using SL algorithms; and (2) the RL process, in which we utilize RL algorithms to improve the performance of the DNN-approximated policy by balancing short-term economic incentives and long-term battery degradation. The SL process speeds up the subsequent RL process by providing a good initialization. By utilizing RL algorithms, one prominent property of the proposed scheme is that it can learn from the data generated by simulating the FR policy on the high-fidelity battery simulator to adjust the DNN-approximated policy, which is originally based on a low-fidelity battery model. A case study using real-world data of FR signals and prices is performed. Simulation results show that, compared to conventional MPC schemes, the proposed deep learning-based scheme can effectively achieve higher economic benefits of FR participation while maintaining lower online computational cost.
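
The two-stage procedure described in the abstract (imitate an MPC first, then fine-tune against a higher-fidelity simulator) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-parameter policy `a = w * s` stands in for the DNN, `expert_action` stands in for the MPC with a low-fidelity model, `simulate_reward` is a hypothetical "high-fidelity" reward with a made-up degradation cost, and a greedy random search stands in for the RL algorithm, which the abstract does not name.

```python
import random


def expert_action(signal):
    # Stand-in for the MPC with a low-fidelity model (assumption):
    # it tracks the FR signal exactly and ignores degradation.
    return signal


def simulate_reward(w, signals, price=1.0, degradation=0.3):
    # Hypothetical "high-fidelity" reward (assumption): a tracking penalty
    # weighted by the FR price, plus a degradation cost on battery usage.
    total = 0.0
    for s in signals:
        a = w * s
        total += -price * (a - s) ** 2 - degradation * a ** 2
    return total / len(signals)


def supervised_fit(signals):
    # SL stage: least-squares fit of the policy a = w * s to expert data.
    num = sum(s * expert_action(s) for s in signals)
    den = sum(s * s for s in signals)
    return num / den


def random_search_finetune(w, signals, steps=200, scale=0.05, seed=0):
    # Fine-tuning stage: greedy random search, used here only as a simple
    # stand-in for an RL algorithm. A candidate is kept only if it improves
    # the simulated reward, so the reward never decreases.
    rng = random.Random(seed)
    best = simulate_reward(w, signals)
    for _ in range(steps):
        candidate = w + rng.gauss(0.0, scale)
        reward = simulate_reward(candidate, signals)
        if reward > best:
            w, best = candidate, reward
    return w


data_rng = random.Random(1)
signals = [data_rng.uniform(-1.0, 1.0) for _ in range(500)]

w_sl = supervised_fit(signals)                # initialization from the expert
w_rl = random_search_finetune(w_sl, signals)  # adjusted on the richer reward

print("SL policy weight:", w_sl, "reward:", simulate_reward(w_sl, signals))
print("RL policy weight:", w_rl, "reward:", simulate_reward(w_rl, signals))
```

In this toy setting the imitation stage reproduces the expert, and the fine-tuning stage then shrinks the weight because the simulated reward penalizes battery usage that the expert ignored. This mirrors, in a very reduced form, the role the abstract assigns to the SL process as a good initialization for the RL process.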
