Paper title
All Data Inclusive, Deep Learning Models to Predict Critical Events in the Medical Information Mart for Intensive Care III Database (MIMIC III)
Paper authors
Abstract
Intensive care clinicians need reliable clinical practice tools to preempt unexpected critical events that might harm their patients in intensive care units (ICUs), to pre-plan timely interventions, and to keep the patient's family well informed. Conventional statistical models are built by curating only a limited number of key variables, which means a vast, unknown amount of potentially valuable data remains unused. Deep learning models (DLMs) can be leveraged to learn from large, complex datasets and to construct predictive clinical tools. This retrospective study was performed using 42,818 hospital admissions involving 35,348 patients, a subset of the MIMIC-III dataset. Natural language processing (NLP) techniques were applied to build DLMs to predict in-hospital mortality (IHM) and length of stay >= 7 days (LOS). Over 75 million events across multiple data sources were processed, resulting in over 355 million tokens. DLMs for predicting IHM using data from all sources (AS) and chart data (CS) achieved an AUC-ROC of 0.9178 and 0.9029, respectively, and a PR-AUC of 0.6251 and 0.5701, respectively. DLMs for predicting LOS using AS and CS achieved an AUC-ROC of 0.8806 and 0.8642, respectively, and a PR-AUC of 0.6821 and 0.6575, respectively. The observed AUC-ROC difference between the AS and CS models was statistically significant for both IHM and LOS at p = 0.05. The observed PR-AUC difference between the models was statistically significant for IHM but not for LOS at p = 0.05. In this study, deep learning models were constructed using data combined from a variety of sources in Electronic Health Records (EHRs), such as chart data, input and output events, laboratory values, microbiology events, procedures, notes, and prescriptions. It is possible to predict in-hospital mortality with much better confidence and higher reliability from models built using all sources of data.
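To make the reported metrics concrete, the following is a minimal sketch of how AUC-ROC and PR-AUC can be computed for two models scored on the same admissions, and how the difference between them might be examined. The abstract does not say which significance test the authors used; the paired bootstrap here is an assumption chosen for illustration. The function names (`auc_roc`, `average_precision`, `paired_bootstrap_diff`) are hypothetical, PR-AUC is approximated by average precision, and the data are synthetic, not drawn from MIMIC-III.

```python
import numpy as np


def auc_roc(y_true, scores):
    """AUC-ROC via the Mann-Whitney U statistic; tied scores get average ranks."""
    y_true = np.asarray(y_true).astype(bool).ravel()
    scores = np.asarray(scores, dtype=float).ravel()
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    # Average 1-based rank shared by all samples with the same score.
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    n_pos = y_true.sum()
    n_neg = y_true.size - n_pos
    u_stat = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def average_precision(y_true, scores):
    """Step-wise PR-AUC: mean precision at the rank of each positive.

    Simplification: tied scores are ordered arbitrarily rather than grouped.
    """
    y_true = np.asarray(y_true).astype(bool).ravel()
    scores = np.asarray(scores, dtype=float).ravel()
    order = np.argsort(-scores, kind="mergesort")
    hits = y_true[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return precision_at_k[hits].mean()


def paired_bootstrap_diff(y_true, scores_a, scores_b, metric, n_boot=1000, seed=0):
    """Observed metric(A) - metric(B) and a 95% percentile bootstrap interval.

    Assumes y_true contains both classes. The same resampled admissions are
    used for both models, so the comparison is paired.
    """
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true).ravel()
    scores_a = np.asarray(scores_a, dtype=float).ravel()
    scores_b = np.asarray(scores_b, dtype=float).ravel()
    n = y_true.size
    diffs = []
    while len(diffs) < n_boot:
        idx = rng.integers(0, n, size=n)
        if y_true[idx].min() == y_true[idx].max():
            continue  # skip resamples that contain only one class
        diffs.append(metric(y_true[idx], scores_a[idx]) - metric(y_true[idx], scores_b[idx]))
    low, high = np.percentile(diffs, [2.5, 97.5])
    observed = metric(y_true, scores_a) - metric(y_true, scores_b)
    return observed, (low, high)


# Synthetic demonstration only: labels and two noisy score vectors.
rng = np.random.default_rng(42)
labels = (rng.random(500) < 0.15).astype(int)      # imbalanced outcome
model_as = labels + rng.normal(0.0, 0.6, 500)      # stand-in for an "all sources" model
model_cs = labels + rng.normal(0.0, 0.8, 500)      # stand-in for a "chart data" model

for name, metric in [("AUC-ROC", auc_roc), ("PR-AUC", average_precision)]:
    diff, (low, high) = paired_bootstrap_diff(labels, model_as, model_cs, metric)
    print(f"{name} difference: {diff:.4f}, 95% CI [{low:.4f}, {high:.4f}]")
```

Under this approach, a 95% interval that excludes zero would correspond to a difference that is significant at roughly the 0.05 level. Reporting PR-AUC alongside AUC-ROC is useful when the positive class is rare, since precision is sensitive to false positives in a way that the ROC curve is not.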