Paper Title


Learning to Adapt Clinical Sequences with Residual Mixture of Experts

Authors

Jeong Min Lee, Milos Hauskrecht

Abstract


Clinical event sequences in Electronic Health Records (EHRs) record detailed information about the patient condition and patient care as they occur in time. Recent years have witnessed increased interest of the machine learning community in developing models that solve different types of problems defined upon the information in EHRs. More recently, neural sequential models, such as RNNs and LSTMs, have become popular and widely applied models for representing patient sequence data and for predicting future events or outcomes based on such data. However, a single neural sequential model may not properly represent the complex dynamics of all patients and the differences in their behaviors. In this work, we aim to alleviate this limitation by refining a one-fits-all model using a Mixture-of-Experts (MoE) architecture. The architecture consists of multiple (expert) RNN models covering patient sub-populations and refining the predictions of the base model. That is, instead of training expert RNN models from scratch, we define them on the residual signal that attempts to model the differences from the population-wide model. The heterogeneity of various patient sequences is modeled through multiple experts, each consisting of an RNN. In particular, instead of training the MoE directly from scratch, we augment the MoE based on the prediction signal from a pretrained base GRU model. In this way, the mixture of experts can provide flexible adaptation to the (limited) predictive power of the single base RNN model. We experiment with the newly proposed model on real-world EHR data and the multivariate clinical event prediction task. We implement the RNNs using Gated Recurrent Units (GRUs). We show a 4.1% gain in AUPRC compared to a single RNN prediction.
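The abstract describes refining a frozen base model's predictions with gated expert residuals. The following is a minimal sketch of that combination step, not the authors' implementation: the function name, the use of NumPy, and the softmax gating over expert scores are illustrative assumptions; in the paper the base model and experts are GRUs producing these signals per time step.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def residual_moe_predict(base_logits, expert_residuals, gate_logits):
    """Combine a pretrained base model's logits with gated expert residuals.

    base_logits:      (d,)  prediction logits from the frozen base model
    expert_residuals: (k, d) residual corrections proposed by k experts
    gate_logits:      (k,)  gating scores over the experts

    Returns refined logits: base prediction plus the mixture-weighted
    residual correction (hypothetical formulation for illustration).
    """
    gate = softmax(gate_logits)           # mixture weights over experts
    correction = gate @ expert_residuals  # weighted sum of residuals, (d,)
    return base_logits + correction       # experts only adjust the base

# Toy example: 2 experts refining 3 event logits.
base = np.array([0.5, -1.0, 2.0])
residuals = np.array([[0.1, 0.0, -0.2],
                      [-0.3, 0.4, 0.0]])
gates = np.zeros(2)                       # uniform gating -> average residual
refined = residual_moe_predict(base, residuals, gates)
```

With uniform gating the correction is simply the mean of the expert residuals; a learned gating network would instead weight experts by how well they match the current patient's sub-population.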
