Title
Ensemble learning using individual neonatal data for seizure detection
Authors
Abstract
Sharing medical data between institutions is difficult in practice because of data protection laws and official procedures within institutions. As a result, most existing algorithms are trained on relatively small electroencephalogram (EEG) data sets, which is likely to be detrimental to prediction accuracy. In this work, we simulate a setting in which data cannot be shared by splitting a publicly available data set into disjoint sets, each representing the data held by an individual institution. We propose to train a (local) detector in each institution and to aggregate the detectors' individual predictions into one final prediction. Four aggregation schemes are compared: the majority vote, the mean, the weighted mean, and the Dawid-Skene method. The method was validated on an independent data set using only a subset of EEG channels. The ensemble reaches accuracy comparable to that of a single detector trained on all the data when a sufficient amount of data is available in each institution. The weighted mean aggregation scheme showed the best performance; it was only marginally outperformed by the Dawid-Skene method when the local detectors approach the performance of a single detector trained on all available data.
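
As an illustration of three of the aggregation schemes named above, the following is a minimal sketch rather than the authors' implementation. It assumes each local detector outputs a seizure probability per EEG segment, that a 0.5 decision threshold applies, and that the weights are supplied by the user (for example from each detector's validation performance); the abstract specifies none of these details. The function names and the toy numbers are hypothetical, and the Dawid-Skene method is omitted because it requires an iterative estimation procedure.

```python
import numpy as np

def majority_vote(probs, threshold=0.5):
    """Label a segment as seizure when more than half of the detectors vote for it.

    probs: array of shape (n_detectors, n_segments) with assumed probabilities.
    """
    votes = (probs >= threshold).astype(int)
    return (2 * votes.sum(axis=0) > probs.shape[0]).astype(int)

def mean_aggregate(probs, threshold=0.5):
    """Average the detectors' probabilities, then threshold the average."""
    return (probs.mean(axis=0) >= threshold).astype(int)

def weighted_mean_aggregate(probs, weights, threshold=0.5):
    """Weighted average of probabilities; weights are normalised to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return ((w @ probs) >= threshold).astype(int)

# Toy example: 3 hypothetical local detectors, 4 EEG segments.
probs = np.array([
    [0.9, 0.2, 0.9, 0.1],
    [0.8, 0.6, 0.3, 0.2],
    [0.3, 0.9, 0.4, 0.4],
])
weights = [0.5, 0.3, 0.2]  # assumed weights, not taken from the paper

print(majority_vote(probs))                      # [1 1 0 0]
print(mean_aggregate(probs))                     # [1 1 1 0]
print(weighted_mean_aggregate(probs, weights))   # [1 0 1 0]
```

The toy values are chosen so that the three schemes disagree on the second and third segments, which shows how the choice of aggregation rule can change the final prediction even when the local outputs are identical.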