Paper title

A predictive analytics approach for stroke prediction using machine learning and neural networks

Authors

Soumyabrata Dev, Hewei Wang, Chidozie Shamrock Nwosu, Nishtha Jain, Bharadwaj Veeravalli, Deepu John

Abstract

The negative impact of stroke in society has led to concerted efforts to improve the management and diagnosis of stroke. With an increased synergy between technology and medical diagnosis, caregivers create opportunities for better patient management by systematically mining and archiving the patients' medical records. Therefore, it is vital to study the interdependency of these risk factors in patients' health records and understand their relative contribution to stroke prediction. This paper systematically analyzes the various factors in electronic health records for effective stroke prediction. Using various statistical techniques and principal component analysis, we identify the most important factors for stroke prediction. We conclude that age, heart disease, average glucose level, and hypertension are the most important factors for detecting stroke in patients. Furthermore, a perceptron neural network using these four attributes provides the highest accuracy rate and lowest miss rate compared to using all available input features and other benchmarking algorithms. As the dataset is highly imbalanced concerning the occurrence of stroke, we report our results on a balanced dataset created via sub-sampling techniques.
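
The abstract mentions building a balanced dataset via sub-sampling but does not say exactly how. One common way to do this is random undersampling of the majority class, sketched below. This is only an illustration, not the authors' procedure: the function name `undersample`, the synthetic data, the 5% positive rate, and the four feature columns are all assumptions made for the example.

```python
import numpy as np


def undersample(X, y, seed=0):
    """Randomly drop rows so both classes end up with equal counts.

    Hypothetical helper for illustration; the paper does not specify
    its exact sub-sampling procedure.
    """
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)  # indices of stroke cases
    neg = np.flatnonzero(y == 0)  # indices of non-stroke cases
    n = min(len(pos), len(neg))   # size of the smaller class
    keep = np.concatenate([
        rng.choice(pos, size=n, replace=False),
        rng.choice(neg, size=n, replace=False),
    ])
    rng.shuffle(keep)             # avoid all positives appearing first
    return X[keep], y[keep]


# Synthetic stand-in for an imbalanced health-record table (assumed data):
# 1000 patients, 4 features, roughly 5% positive labels.
data_rng = np.random.default_rng(42)
y = (data_rng.random(1000) < 0.05).astype(int)
X = data_rng.normal(size=(1000, 4))

X_bal, y_bal = undersample(X, y)
print(X_bal.shape, int(y_bal.sum()), int((y_bal == 0).sum()))
```

After this step the two classes have the same number of rows, so accuracy and miss rate computed on the result are not dominated by the majority class.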