Paper Title

COVIDx CT-3: A Large-scale, Multinational, Open-Source Benchmark Dataset for Computer-aided COVID-19 Screening from Chest CT Images

Authors

Hayden Gunraj, Tia Tuinstra, Alexander Wong

Abstract

Computed tomography (CT) has been widely explored as a COVID-19 screening and assessment tool to complement RT-PCR testing. To assist radiologists with CT-based COVID-19 screening, a number of computer-aided systems have been proposed. However, many proposed systems are built using CT data that is limited in both quantity and diversity. Motivated to support efforts in the development of machine learning-driven screening systems, we introduce COVIDx CT-3, a large-scale multinational benchmark dataset for detection of COVID-19 cases from chest CT images. COVIDx CT-3 includes 431,205 CT slices from 6,068 patients across at least 17 countries, which to the best of our knowledge represents the largest, most diverse dataset of COVID-19 CT images in open-access form. Additionally, we examine the data diversity and potential biases of the COVIDx CT-3 dataset, finding that significant geographic and class imbalances remain despite efforts to curate data from a wide variety of sources.
