Paper Title

Deep Learning Discovery of Demographic Biomarkers in Echocardiography

Authors

Duffy, Grant; Clarke, Shoa L.; Christensen, Matthew; He, Bryan; Yuan, Neal; Cheng, Susan; Ouyang, David

Abstract

Deep learning has been shown to accurately assess "hidden" phenotypes and to predict biomarkers from medical imaging beyond what traditional clinician interpretation provides. Given the black-box nature of artificial intelligence (AI) models, caution should be exercised when applying them to healthcare, because a prediction task may be shortcut by demographic differences across disease and patient populations. Using large echocardiography datasets from two healthcare systems, we test whether age, race, and sex can be predicted from cardiac ultrasound images using deep learning algorithms, and we assess the impact of varying confounding variables. We trained video-based convolutional neural networks to predict age, sex, and race. We found that the deep learning models were able to identify age and sex but were unable to reliably predict race. Without accounting for confounding differences between categories, the AI model predicted sex with an AUC of 0.85 (95% CI 0.84 to 0.86), age with a mean absolute error of 9.12 years (95% CI 9.00 to 9.25), and race with AUCs ranging from 0.63 to 0.71. When predicting race, we show that tuning the proportion of a confounding variable (sex) in the training data significantly impacts model AUC (ranging from 0.57 to 0.84), whereas when training a sex prediction model, tuning a confounder (race) did not substantially change AUC (0.81 to 0.83). This suggests that a significant proportion of the model's performance in predicting race could come from confounding features detected by the AI. Further work remains to identify the particular imaging features associated with demographic information and to better understand the risks of demographic identification in medical AI as it pertains to potentially perpetuating bias and disparities.
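The abstract mentions tuning the proportion of a confounding variable in the training data but does not describe how this was done. The following is a minimal sketch of one way such a training subset could be constructed, not the authors' procedure. The function name `subsample_with_confounder_rate`, the record keys (`label`, `sex`, `id`), and the synthetic records are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def subsample_with_confounder_rate(records, label_key, confounder_key,
                                   confounder_value, rate_by_label,
                                   n_per_label, seed=0):
    """Draw n_per_label records for each label so that, within a label,
    the fraction with confounder == confounder_value matches the target rate.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    sampled = []
    for label, rate in rate_by_label.items():
        pool = [r for r in records if r[label_key] == label]
        with_value = [r for r in pool if r[confounder_key] == confounder_value]
        without_value = [r for r in pool if r[confounder_key] != confounder_value]
        n_with = round(rate * n_per_label)
        n_without = n_per_label - n_with
        if n_with > len(with_value) or n_without > len(without_value):
            raise ValueError(f"not enough records for label {label!r}")
        # Sample without replacement from each confounder stratum.
        sampled += rng.sample(with_value, n_with)
        sampled += rng.sample(without_value, n_without)
    rng.shuffle(sampled)
    return sampled


# Synthetic records (assumption): two labels, two confounder values, 100 each.
records = [
    {"label": label, "sex": sex, "id": i}
    for label in ("A", "B")
    for sex in ("F", "M")
    for i in range(100)
]

# Strongly confounded training set: label A is 90% "F", label B is 10% "F".
train = subsample_with_confounder_rate(
    records, "label", "sex", "F", {"A": 0.9, "B": 0.1}, n_per_label=50)
counts = Counter((r["label"], r["sex"]) for r in train)
```

Sweeping the target rates from balanced (0.5 for both labels) toward strongly skewed values, retraining a model on each subset, and evaluating on a fixed held-out set would be one way to observe how much of a model's AUC depends on the confounder.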
