Paper Title

Deep Learning-based Predictive Control of Battery Management for Frequency Regulation

Authors

Yun Li, Yixiu Wang, Yifu Chen, Kaixun Hua, Jiayang Ren, Ghazaleh Mozafari, Qiugang Lu, Yankai Cao

Abstract

This paper proposes a deep learning-based optimal battery management scheme for frequency regulation (FR) by integrating model predictive control (MPC), supervised learning (SL), reinforcement learning (RL), and high-fidelity battery models. Because the policy is approximated by a deep neural network (DNN), it is computationally efficient in online implementation. The design procedure of the proposed scheme consists of two sequential processes: (1) the SL process, in which we first run a simulation with an MPC embedding a low-fidelity battery model to generate a training data set and then optimize a DNN-approximated policy on that data set using SL algorithms; and (2) the RL process, in which we use RL algorithms to improve the performance of the DNN-approximated policy by balancing short-term economic incentives against long-term battery degradation. The SL process speeds up the subsequent RL process by providing a good initialization. A prominent property of the proposed scheme is that, through the RL algorithms, it can learn from data generated by simulating the FR policy on the high-fidelity battery simulator and thereby adjust the DNN-approximated policy, which is originally based on a low-fidelity battery model. A case study using real-world data of FR signals and prices is performed. Simulation results show that, compared to conventional MPC schemes, the proposed deep learning-based scheme can effectively achieve higher economic benefits from FR participation while maintaining a lower online computational cost.
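The SL process described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The "expert" here is a simple clipping rule standing in for the MPC, the battery model is a hypothetical one-line state-of-charge update, and the policy is a small one-hidden-layer network trained by full-batch gradient descent in NumPy. All names and constants (`SOC_MIN`, `SOC_MAX`, `K`, `expert_action`, `make_dataset`, `train_policy`) are assumptions made for illustration.

```python
import numpy as np

SOC_MIN, SOC_MAX = 0.1, 0.9   # assumed state-of-charge limits
K = 0.1                       # assumed SOC change per unit power per step


def expert_action(soc, signal):
    """Stand-in for the MPC expert (not a real MPC): follow the FR signal,
    clipped so the next SOC (soc + K * action) stays within the limits."""
    low = np.maximum((SOC_MIN - soc) / K, -1.0)
    high = np.minimum((SOC_MAX - soc) / K, 1.0)
    return np.clip(signal, low, high)


def make_dataset(n, rng):
    """Sample (SOC, FR signal) states and label them with the expert's action."""
    soc = rng.uniform(SOC_MIN, SOC_MAX, n)
    signal = rng.uniform(-1.0, 1.0, n)
    X = np.column_stack([soc, signal])
    y = expert_action(soc, signal).reshape(-1, 1)
    return X, y


def train_policy(X, y, hidden=16, lr=0.05, steps=500, seed=0):
    """Fit a one-hidden-layer tanh network to the expert data (supervised step)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(steps):
        h = np.tanh(X @ W1 + b1)             # hidden activations
        err = h @ W2 + b2 - y                # prediction error
        losses.append(0.5 * float(np.mean(err ** 2)))
        g_out = err / len(X)                 # gradient of the loss w.r.t. the output
        g_h = (g_out @ W2.T) * (1.0 - h ** 2)  # backpropagate through tanh
        W2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return (W1, b1, W2, b2), losses


if __name__ == "__main__":
    X, y = make_dataset(512, np.random.default_rng(0))
    params, losses = train_policy(X, y)
    print(f"imitation loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

In the scheme the abstract describes, the weights obtained this way would serve only as the initialization for the RL process, which then fine-tunes the policy against a high-fidelity battery simulator. That stage is not sketched here.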
