Development of Interpretable Machine Learning Models to Detect Arrhythmia based on ECG Data

Paper Title

Development of Interpretable Machine Learning Models to Detect Arrhythmia based on ECG Data

Author

Verma, Shourya

Abstract

The analysis of electrocardiogram (ECG) signals can be time-consuming because it is performed manually by cardiologists. Automation through machine learning (ML) classification is therefore increasingly proposed, allowing ML models to learn the features of a heartbeat and detect abnormalities. However, the lack of interpretability hinders the application of deep learning in healthcare. Making these models interpretable lets us understand how a machine learning algorithm reaches its decisions and which patterns it follows for classification. This thesis builds Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) classifiers based on state-of-the-art models and compares their performance and interpretability with those of shallow classifiers. Both global and local interpretability methods are used: global methods to understand the interaction between dependent and independent variables across the entire dataset, and local methods to examine model decisions for each sample. Partial Dependence Plots, Shapley Additive Explanations, Permutation Feature Importance, and Gradient-weighted Class Activation Mapping (Grad-CAM) are the four interpretability techniques implemented on time-series ML models classifying ECG rhythms. In particular, we use Grad-CAM, a local interpretability technique, and examine whether its explanations differ between correctly and incorrectly classified ECG beats within each class. Furthermore, the classifiers are evaluated using K-Fold cross-validation and Leave Groups Out techniques, and we use non-parametric statistical testing to examine whether the differences are significant. Grad-CAM was found to be the most effective interpretability technique for explaining the predictions of the proposed CNN and LSTM models. We conclude that all high-performing classifiers looked at the QRS complex of the ECG rhythm when making predictions.
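
Of the four techniques named in the abstract, Permutation Feature Importance is the simplest to illustrate: shuffle one feature at a time and measure how much the model's score drops. The following is a minimal sketch of that general idea, not the thesis's implementation. The toy data, the `toy_model` classifier, and the function names are all hypothetical stand-ins chosen for illustration; feature 0 loosely plays the role of a QRS-related amplitude and feature 1 is noise.

```python
import random


def accuracy(model, X, y):
    """Fraction of samples the model labels correctly."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy data (not from the thesis): the label depends on feature 0 only.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = [int(row[0] > 0) for row in X]


def toy_model(row):
    """Stand-in classifier that thresholds feature 0 and ignores feature 1."""
    return int(row[0] > 0)


importances = permutation_importance(toy_model, X, y)
print(importances)
```

Because the stand-in classifier ignores feature 1, shuffling that column leaves accuracy unchanged and its importance is zero, while shuffling feature 0 produces a clear drop. With a real ECG classifier, the same loop would be applied to whatever input features or time segments the model consumes.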
