Real-Time Model Calibration with Deep Reinforcement Learning

Paper Title

Real-Time Model Calibration with Deep Reinforcement Learning

Authors

Tian, Yuan; Chao, Manuel Arias; Kulkarni, Chetan; Goebel, Kai; Fink, Olga

Abstract

The dynamic, real-time, and accurate inference of model parameters from empirical data is of great importance in many scientific and engineering disciplines that use computational models (such as a digital twin) for the analysis and prediction of complex physical processes. However, fast and accurate inference for processes with large and high dimensional datasets cannot easily be achieved with state-of-the-art methods under noisy real-world conditions. The primary reason is that the inference of model parameters with traditional techniques based on optimisation or sampling often suffers from computational and statistical challenges, resulting in a trade-off between accuracy and deployment time. In this paper, we propose a novel framework for inference of model parameters based on reinforcement learning. The contribution of the paper is twofold: 1) We reformulate the inference problem as a tracking problem with the objective of learning a policy that forces the response of the physics-based model to follow the observations; 2) We propose the constrained Lyapunov-based actor-critic (CLAC) algorithm to enable the robust and accurate inference of physics-based model parameters in real time under noisy real-world conditions. The proposed methodology is demonstrated and evaluated on two model-based diagnostics test cases utilizing two different physics-based models of turbofan engines. The performance of the methodology is compared to that of two alternative approaches: a state update method (unscented Kalman filter) and a supervised end-to-end mapping with deep neural networks. The experimental results demonstrate that the proposed methodology outperforms all other tested methods in terms of speed and robustness, with high inference accuracy.
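
The tracking formulation named in the abstract can be illustrated with a small toy sketch. This is not the authors' CLAC algorithm and not their turbofan models. It only shows the general idea of treating parameter inference as tracking: the action is a correction to the parameter estimate, and the reward penalizes the gap between the model response and the noisy observation. Everything here (`toy_model`, `TrackingEnv`, `placeholder_policy`, the scalar parameter, the gain) is a hypothetical stand-in chosen for illustration, and a hand-written proportional rule takes the place of a learned policy.

```python
import numpy as np


def toy_model(theta, u):
    """Hypothetical stand-in for a physics-based model: response = theta * u."""
    return theta * u


class TrackingEnv:
    """Toy tracking environment (assumption, not from the paper).

    The action is a correction to the parameter estimate; the reward is the
    negative squared gap between the observation and the model response.
    """

    def __init__(self, true_theta=2.0, noise_std=0.01, seed=0):
        self.true_theta = true_theta
        self.noise_std = noise_std
        self.rng = np.random.default_rng(seed)
        self.theta_hat = 0.0  # initial parameter estimate

    def observe(self, u):
        # Noisy measurement produced by the "real" system
        noise = self.rng.normal(0.0, self.noise_std)
        return toy_model(self.true_theta, u) + noise

    def step(self, action, u):
        # Apply the correction, then score how well the model tracks the data
        self.theta_hat += action
        error = self.observe(u) - toy_model(self.theta_hat, u)
        reward = -error ** 2
        return error, reward


def placeholder_policy(error, u, gain=0.5):
    """Hand-written proportional rule standing in for a learned policy."""
    return gain * error / u


def run_episode(steps=50):
    env = TrackingEnv()
    u = 1.0  # constant operating condition, for simplicity
    error, _ = env.step(0.0, u)  # first step only measures the initial gap
    for _ in range(steps):
        action = placeholder_policy(error, u)
        error, _ = env.step(action, u)
    return env.theta_hat


if __name__ == "__main__":
    print(f"estimated theta: {run_episode():.3f}")
```

In a reinforcement learning setting, `placeholder_policy` would be replaced by a policy trained to maximize the cumulative reward, so that inference at deployment time reduces to evaluating the policy rather than running an optimizer or a sampler.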
