Paper Title
TranSiam: Fusing Multimodal Visual Features Using Transformer for Medical Image Segmentation
Paper Authors
Abstract
Automatic segmentation of multimodal medical images is an important topic in disease diagnosis. Although convolutional neural networks (CNNs) have been shown to perform very well on image segmentation tasks, they have difficulty capturing global information, and the lack of global information can seriously degrade the accuracy of lesion segmentation. In addition, multimodal data from the same patient differ in visual representation, and these differences affect the results of automatic segmentation methods. To address these problems, we propose TranSiam, a segmentation method for multimodal medical images that can capture global information. TranSiam is a 2D dual-path network that extracts features from different modalities. In each path, we use convolution to extract detailed information in the low-level stage and design an ICMT block to extract global information in the high-level stage. The ICMT block embeds convolution in the transformer, allowing it to extract global information while retaining spatial and detailed information. Furthermore, we design a novel fusion mechanism based on cross-attention and self-attention, called the TMM block, which effectively fuses features across modalities. On the BraTS 2019 and BraTS 2020 multimodal datasets, our method achieves a significant improvement in accuracy over other popular methods.
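
The abstract does not specify the internals of the TMM block, so the following is only a minimal, dependency-free sketch of the general idea it names: fusing two modality feature streams with cross-attention followed by self-attention. The function names (`softmax`, `attention`, `fuse`), the residual addition, and the absence of learned projections are all assumptions made for illustration, not the authors' design.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of token vectors.

    Assumption: no learned Q/K/V projections; tokens are used directly.
    """
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def fuse(tokens_a, tokens_b):
    """Hypothetical two-modality fusion (not the paper's TMM block).

    1. Cross-attention: each modality queries the other one.
    2. Residual add of the attended features (an assumption).
    3. Self-attention within each fused stream.
    """
    a_from_b = attention(tokens_a, tokens_b, tokens_b)
    b_from_a = attention(tokens_b, tokens_a, tokens_a)
    fused_a = [[x + y for x, y in zip(t, c)] for t, c in zip(tokens_a, a_from_b)]
    fused_b = [[x + y for x, y in zip(t, c)] for t, c in zip(tokens_b, b_from_a)]
    return (attention(fused_a, fused_a, fused_a),
            attention(fused_b, fused_b, fused_b))


# Toy example: 3 tokens per modality, 2 features per token.
modality_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
modality_b = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]
out_a, out_b = fuse(modality_a, modality_b)
```

In this sketch each output stream keeps the token count and feature width of its input stream, so the fused features could be passed on to a decoder path unchanged. A practical implementation would add learned projections, multiple heads, and normalization.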