Paper Title
Multilabel 12-Lead Electrocardiogram Classification Using Gradient Boosting Tree Ensemble
Paper Authors
Paper Abstract
The 12-lead electrocardiogram (ECG) is a commonly used tool for detecting cardiac abnormalities such as atrial fibrillation, blocks, and irregular complexes. For the PhysioNet/CinC 2020 Challenge, we built an algorithm that classifies ECG diagnoses using gradient boosted tree ensembles fitted on morphology and signal processing features. For each lead, we derive features from heart rate variability, PQRST template shape, and the full signal waveform. We concatenate the features of all 12 leads and fit an ensemble of gradient boosting decision trees to predict the probability that each ECG instance belongs to each class. In phase one, we train a set of models to determine feature importance and isolate the 1,000 most important features; in phase two, we use those features in our diagnosis prediction models. For internal evaluation, we use repeated random sub-sampling, splitting our dataset of 43,101 records into 85:15 training/validation sets across 100 independent runs. Our method achieved an official phase validation set score of 0.476 and a test set score of -0.080 under the team name CVC, placing us 36th out of 41 in the rankings.
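The evaluation and feature-selection steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. The function names `repeated_random_splits` and `top_k_features`, the fixed seed, and the rounding of the 85% training share are all assumptions made for illustration. The abstract also does not say how importances from the phase-one models are aggregated, so the sketch simply takes one importance score per feature.

```python
import random


def repeated_random_splits(n_records, n_runs=100, train_fraction=0.85, seed=0):
    """Yield (train_idx, val_idx) pairs for repeated random sub-sampling.

    Hypothetical helper: each run shuffles all record indices afresh and
    cuts the permutation at the training share, so runs are independent.
    """
    rng = random.Random(seed)
    n_train = round(n_records * train_fraction)  # assumption: round to nearest
    indices = list(range(n_records))
    for _ in range(n_runs):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield shuffled[:n_train], shuffled[n_train:]


def top_k_features(importances, k=1000):
    """Return the indices of the k highest-scoring features, best first.

    Hypothetical helper: `importances` holds one score per feature, for
    example as reported by a fitted gradient boosting model.
    """
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i],
                    reverse=True)
    return ranked[:k]


# First of the 100 splits for a dataset of 43,101 records.
splits = repeated_random_splits(43101)
train_idx, val_idx = next(splits)
print(len(train_idx), len(val_idx))  # 36636 6465

# Toy importance scores: feature 1 ranks first, then feature 2.
print(top_k_features([0.1, 0.9, 0.5], k=2))  # [1, 2]
```

Because every run draws a new permutation, a record can land in the validation set in several runs and in none of others. This is what distinguishes repeated random sub-sampling from k-fold cross-validation.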