Paper title
Machine learning approaches for localized lockdown during COVID-19: a case study analysis
Paper authors
Paper abstract
At the end of 2019, the novel coronavirus SARS-CoV-2 emerged as the cause of a significant acute respiratory disease that has since become a global pandemic. Countries like Brazil have had difficulty dealing with the virus because of large socioeconomic differences among states and municipalities. This study therefore presents a new approach that applies different machine learning and deep learning algorithms to Brazilian COVID-19 data. First, a clustering algorithm is used to identify counties with similar sociodemographic behavior, while Benford's law is used to check for data manipulation. Based on these results, we are able to fit SARIMA models per cluster to predict new daily cases. The unsupervised machine learning techniques optimized the process of defining the parameters of the SARIMA model. This framework can also be useful for proposing confinement scenarios during the so-called second wave. We used the 645 counties of São Paulo state, the most populous state in Brazil; however, the methodology can be applied to other states or countries. This paper demonstrates how different techniques from machine learning, deep learning, data mining, and statistics can be used together to produce important results when dealing with pandemic data. Although the findings cannot be used exclusively to assess and influence policy decisions, they offer an alternative to the ineffective measures that have been used.
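
The abstract mentions Benford's law as a check for data manipulation but does not say how the test was carried out. As a rough illustration only, the sketch below compares the leading digits of a list of case counts with the Benford distribution, P(d) = log10(1 + 1/d), using a Pearson chi-square statistic. The function names, the choice of a chi-square test, and the sample inputs are assumptions made for this example; they are not taken from the paper.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under Benford's law: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Return the leading decimal digit of a positive integer."""
    return int(str(n)[0])


def benford_chi_square(counts):
    """Pearson chi-square statistic of observed first digits against Benford's law.

    Values <= 0 are skipped because they have no leading digit in 1..9.
    (Hypothetical helper; the paper's actual test procedure is not stated.)
    """
    digits = [first_digit(c) for c in counts if c > 0]
    if not digits:
        raise ValueError("need at least one positive count")
    observed = Counter(digits)
    n = len(digits)
    return sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_expected().items()
    )


# Critical value of the chi-square distribution with 8 degrees of freedom
# at the 5% significance level.
CHI2_CRIT_8DF_5PCT = 15.507

# Illustrative inputs only (not COVID-19 data):
benford_like = [2 ** k for k in range(1, 200)]   # powers of 2 follow Benford closely
uniform_like = list(range(100, 1000))            # leading digits are uniform

stat_ok = benford_chi_square(benford_like)
stat_bad = benford_chi_square(uniform_like)
print(f"powers of 2: chi2 = {stat_ok:.2f}")
print(f"100..999:    chi2 = {stat_bad:.2f}")
```

A statistic above the critical value suggests the leading digits deviate from Benford's law, which may warrant a closer look at how the series was reported; on its own it does not prove manipulation.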