Paper Title
Gaze-Guided Class Activation Mapping: Leveraging Human Attention for Network Attention in Chest X-rays Classification
Paper Authors
Paper Abstract
The increased availability and accuracy of eye-gaze tracking technology have sparked attention-related research in psychology, neuroscience, and, more recently, computer vision and artificial intelligence. The attention mechanism in artificial neural networks is known to improve learning tasks. However, no previous research has combined network attention with human attention. This paper describes a gaze-guided class activation mapping (GG-CAM) method that directly regulates the formation of network attention based on expert radiologists' visual attention for the chest X-ray pathology classification problem, which remains challenging due to the complex and often nuanced differences among images. GG-CAM is a lightweight ($3$ additional trainable parameters for regulating the learning process) and generic extension that can be easily applied to most classification convolutional neural networks (CNNs). Once fully trained, GG-CAM-modified CNNs do not require human attention as an input. Comparative experiments suggest that two standard CNNs with the GG-CAM extension achieve significantly better classification performance. The median area under the curve (AUC) for ResNet50 increases from $0.721$ to $0.776$; for EfficientNetV2-S, the median AUC increases from $0.723$ to $0.801$. GG-CAM also improves the interpretability of the network, facilitating weakly-supervised pathology localization and analysis.
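The abstract does not spell out the GG-CAM formulation, but the general idea of steering a classification CNN's class activation map toward a radiologist's gaze heatmap during training can be sketched as follows. This is a minimal, hypothetical PyTorch rendering, not the authors' implementation: the ResNet50 backbone, the CAM projection, the KL-divergence alignment term, and the trainable weight log_lambda are assumptions introduced for illustration, and the paper's actual three trainable regulation parameters are not reproduced here.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

class GazeGuidedCAMNet(nn.Module):
    """Hypothetical sketch: a ResNet50 classifier whose class activation
    map (CAM) can be compared against a gaze heatmap during training."""

    def __init__(self, num_classes: int):
        super().__init__()
        backbone = resnet50(weights=None)
        # All layers up to (and including) the last convolutional block.
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(2048, num_classes)
        # Assumed trainable scalar weighting the gaze-alignment loss.
        self.log_lambda = nn.Parameter(torch.zeros(()))

    def forward(self, x):
        fmap = self.features(x)                       # (B, 2048, H, W)
        logits = self.fc(self.pool(fmap).flatten(1))  # (B, K)
        # CAM: project feature maps through the classifier weights.
        cam = torch.einsum("bchw,kc->bkhw", fmap, self.fc.weight)
        return logits, cam

def gaze_guided_loss(model, logits, cam, labels, gaze_map):
    """Multi-label classification loss plus an assumed CAM-vs-gaze
    alignment term (KL divergence between spatial distributions)."""
    cls_loss = F.binary_cross_entropy_with_logits(logits, labels)

    # Average the CAMs of the positive classes for each image.
    b, k, h, w = cam.shape
    pos = labels[:, :, None, None]
    cam_sel = (cam * pos).sum(1) / labels.sum(1).clamp(min=1.0)[:, None, None]

    # Resize the gaze heatmap (B, 1, H0, W0) to the CAM resolution and
    # normalize both to probability distributions over spatial locations.
    gaze = F.interpolate(gaze_map, size=(h, w), mode="bilinear",
                         align_corners=False).flatten(1)
    gaze = gaze / gaze.sum(1, keepdim=True).clamp(min=1e-8)
    cam_logp = F.log_softmax(cam_sel.flatten(1), dim=1)

    align_loss = F.kl_div(cam_logp, gaze, reduction="batchmean")
    return cls_loss + model.log_lambda.exp() * align_loss

In this sketch the gaze heatmap is only consumed by the loss function, so inference uses the logits alone, which is consistent with the abstract's claim that a fully trained GG-CAM-modified CNN does not require human attention as an input.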