Paper Title
MedFACT: Modeling Medical Feature Correlations in Patient Health Representation Learning via Feature Clustering
Paper Authors
Paper Abstract
In healthcare prediction tasks, it is essential to exploit the correlations between medical features to learn better patient health representations. Existing methods either estimate feature correlations from data alone or improve the quality of estimation by introducing task-specific medical knowledge. However, the former struggle to estimate feature correlations when training samples are insufficient, and the latter cannot be generalized to other tasks because they rely on task-specific knowledge. Medical research has shown that not all medical features are strongly correlated. To address these issues, we group strongly correlated features and learn feature correlations in a group-wise manner, which reduces learning complexity without losing generality. In this paper, we propose MedFACT, a general patient health representation learning framework. We estimate correlations by measuring the similarity between the temporal patterns of medical features with kernel methods, and cluster strongly correlated features into groups. The feature groups are then formulated as a correlation graph, and we employ graph convolutional networks to conduct group-wise feature interactions for better representation learning. Experiments on two real-world datasets demonstrate the superiority of MedFACT. The discovered medical findings are also confirmed by the literature, providing valuable medical insights and explanations.
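The pipeline outlined in the abstract (kernel similarity between temporal patterns, grouping of strongly correlated features, and graph convolution over the groups) can be illustrated with a minimal NumPy sketch. The abstract does not specify the kernel, the clustering algorithm, or the network architecture, so everything below is an assumption made for illustration and not MedFACT's actual implementation: the RBF kernel, the threshold-based connected-component grouping, the averaged group adjacency, and all function names and parameters (`rbf_similarity`, `group_features`, `group_graph`, `gcn_layer`, `gamma`, `threshold`) are hypothetical.

```python
import numpy as np


def rbf_similarity(series, gamma=1.0):
    """Pairwise RBF kernel between feature time series (assumed kernel).

    series: array of shape (num_features, num_timesteps), one row per feature.
    """
    # z-score each feature so scale differences do not dominate the kernel
    z = (series - series.mean(axis=1, keepdims=True)) / (
        series.std(axis=1, keepdims=True) + 1e-8
    )
    sq_dist = ((z[:, None, :] - z[None, :, :]) ** 2).mean(axis=2)
    return np.exp(-gamma * sq_dist)


def group_features(similarity, threshold=0.5):
    """Group features by connected components of the graph that links two
    features whenever their similarity exceeds `threshold` (assumed rule)."""
    n = similarity.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(similarity[i] > threshold):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def group_graph(similarity, labels):
    """Group-level adjacency: mean similarity between members of two groups.
    The diagonal holds within-group similarity and acts as a self-loop."""
    k = labels.max() + 1
    member = np.eye(k)[labels]  # (num_features, k) one-hot membership
    counts = member.sum(axis=0)
    return member.T @ similarity @ member / np.outer(counts, counts)


def gcn_layer(adj, h, weight):
    """One graph-convolution step: ReLU(D^-1/2 A D^-1/2 H W)."""
    deg = adj.sum(axis=1)
    norm = adj / np.sqrt(np.outer(deg, deg))
    return np.maximum(norm @ h @ weight, 0.0)


# Toy data: two sine-like and two cosine-like synthetic "features".
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 48)
series = np.stack([
    np.sin(t), np.sin(t) + 0.05 * rng.normal(size=t.size),
    np.cos(t), np.cos(t) + 0.05 * rng.normal(size=t.size),
])
sim = rbf_similarity(series)
labels = group_features(sim, threshold=0.5)
adj = group_graph(sim, labels)
group_embeddings = rng.normal(size=(adj.shape[0], 8))  # placeholder inputs
out = gcn_layer(adj, group_embeddings, rng.normal(size=(8, 4)))
```

In this toy setting the sine-like features fall into one group and the cosine-like features into another, and the graph convolution mixes information between the two group embeddings in proportion to their average similarity. Working at the group level keeps the graph small, which is the complexity reduction the abstract argues for.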