A new semi-supervised self-training method for lung cancer prediction

Paper title

A new semi-supervised self-training method for lung cancer prediction

Authors

Shak, Kelvin; Al-Shabi, Mundher; Liew, Andrea; Lan, Boon Leong; Chan, Wai Yee; Ng, Kwan Hoong; Tan, Maxine

Abstract

Background and Objective: Early detection of lung cancer is crucial because it has a high mortality rate, and patients commonly present with the disease at stage 3 or above. Relatively few methods simultaneously detect and classify nodules from computed tomography (CT) scans. Furthermore, very few studies have used semi-supervised learning for lung cancer prediction. This study presents a complete end-to-end scheme to detect and classify lung nodules using the state-of-the-art Self-training with Noisy Student method on a comprehensive CT lung screening dataset of around 4,000 CT scans. Methods: We used three datasets, namely LUNA16, LIDC and NLST, for this study. We first used a three-dimensional deep convolutional neural network model to detect lung nodules in the detection stage. The classification model, known as the Maxout Local-Global Network, uses non-local networks to detect global features including shape features, residual blocks to detect local features including nodule texture, and a Maxout layer to detect nodule variations. We trained the first Self-training with Noisy Student model to predict lung cancer on the unlabelled NLST dataset. Then, we applied Mixup regularization to enhance our scheme and provide robustness to erroneous labels. Results and Conclusions: Our new Mixup Maxout Local-Global network achieves an AUC of 0.87 on 2,005 completely independent testing scans from the NLST dataset. Our new scheme significantly outperformed the next-highest-performing method at the 5% significance level using DeLong's test (p = 0.0001). This study presents a new complete end-to-end scheme to predict lung cancer using Self-training with Noisy Student combined with Mixup regularization. On a completely independent dataset of 2,005 scans, we achieved state-of-the-art performance, even with more images than other methods used.
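
The abstract names Mixup regularization but gives no implementation details. As a rough illustration only, the general Mixup idea (blending pairs of training examples and their labels with a weight drawn from a Beta distribution) can be sketched in plain Python. This is not the authors' code: the function names, the `alpha=0.2` default, and the use of flat lists in place of 3D CT patches are all assumptions made for this sketch.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, lam):
    """Blend two examples and their label vectors with weight lam in [0, 1]."""
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix


def mixup_batch(xs, ys, alpha=0.2, rng=None):
    """Mix every example with a randomly chosen partner from the same batch.

    alpha is a hypothetical default; one mixing weight lam ~ Beta(alpha, alpha)
    is drawn per batch and shared by all pairs.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    partners = list(range(len(xs)))
    rng.shuffle(partners)
    mixed = [
        mixup_pair(xs[i], ys[i], xs[j], ys[j], lam)
        for i, j in enumerate(partners)
    ]
    return [m[0] for m in mixed], [m[1] for m in mixed], lam


# Toy usage: flat feature lists stand in for image patches,
# and labels are one-hot (or soft pseudo-label) vectors.
features = [[0.0, 2.0], [4.0, 6.0], [1.0, 1.0]]
labels = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
mixed_x, mixed_y, lam = mixup_batch(features, labels, rng=random.Random(0))
```

Because the labels are blended along with the inputs, a single wrong label (for example, a pseudo-label produced by a teacher model in self-training) contributes only part of any mixed target, which is one common explanation for Mixup's robustness to label noise.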

Paper title