Paper Title
Impact of Medical Data Imprecision on Learning Results
Paper Authors
Abstract
Test data measured by medical instruments often carry ranges of imprecision that contain the true values, and the true values themselves are unobtainable in virtually all cases. Most learning algorithms, however, perform arithmetic calculations that are affected by this uncertainty, both in the learning process that produces a model and in applications of the learned model, e.g., prediction. In this paper, we initiate a study of the impact of imprecision on prediction results in a healthcare application where a pre-trained model is used to predict the future state of hyperthyroidism for patients. We formulate a model for data imprecision. With parameters that control the degree of imprecision, this model can be used to generate imprecise samples for comparison experiments. Further, a group of measures is defined to evaluate the different impacts quantitatively. More specifically, we define statistics that measure inconsistent predictions for individual patients. We perform experimental evaluations that compare prediction results based on data from the original dataset with those based on corresponding data generated from the proposed imprecision model, using a long short-term memory (LSTM) network. The results on a real-world hyperthyroidism dataset provide insights into how small imprecisions can cause large ranges of predicted results, which could lead to mislabeling and inappropriate actions (treatment or no treatment) for individual patients.
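The abstract does not spell out the imprecision model or the inconsistency statistics, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each measurement may deviate uniformly by a relative amount controlled by a single parameter `rel_imprecision`, replaces the pre-trained LSTM with a hypothetical `toy_predictor`, and reports the range of predictions and the fraction of samples whose label flips relative to the unperturbed prediction. All names, values, and the decision threshold are illustrative and do not come from the paper.

```python
import random


def generate_imprecise_samples(series, rel_imprecision, n_samples, rng):
    """Return n_samples perturbed copies of a measurement series.

    Assumed imprecision model (not the paper's): each value v is replaced
    by a value drawn uniformly from [v*(1-e), v*(1+e)], e = rel_imprecision.
    """
    return [
        [v * (1.0 + rng.uniform(-rel_imprecision, rel_imprecision)) for v in series]
        for _ in range(n_samples)
    ]


def toy_predictor(series):
    """Hypothetical stand-in for a pre-trained model.

    Predicts the next value as the mean of the last two measurements.
    """
    return (series[-2] + series[-1]) / 2.0


def inconsistency_stats(series, rel_imprecision, threshold, n_samples=1000, seed=0):
    """Compare the prediction on the original series with predictions on
    imprecise samples, for one patient."""
    rng = random.Random(seed)
    baseline = toy_predictor(series)
    samples = generate_imprecise_samples(series, rel_imprecision, n_samples, rng)
    preds = [toy_predictor(s) for s in samples]

    # A label is "abnormal" when the predicted value exceeds the threshold.
    baseline_label = baseline > threshold
    flips = sum((p > threshold) != baseline_label for p in preds)

    return {
        "baseline": baseline,
        "pred_min": min(preds),
        "pred_max": max(preds),
        "flip_rate": flips / n_samples,
    }


if __name__ == "__main__":
    # Illustrative values only; not taken from any real dataset.
    patient = [1.60, 1.70, 1.75, 1.78]
    print(inconsistency_stats(patient, rel_imprecision=0.05, threshold=1.80))
```

In this sketch a patient whose unperturbed prediction lies just below the threshold can receive a different label in a noticeable share of the perturbed samples, which is the kind of per-patient inconsistency the abstract refers to. Setting `rel_imprecision` to zero reproduces the baseline prediction exactly.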