Paper Title
Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
Paper Authors
Abstract
Many applications, such as autonomous driving, rely heavily on multi-modal data where spatial alignment between the modalities is required. Most multi-modal registration methods struggle to compute the spatial correspondence between the images using prevalent cross-modality similarity measures. In this work, we bypass the difficulty of developing cross-modality similarity measures by training an image-to-image translation network on the two input modalities. This learned translation allows the registration network to be trained using simple and reliable mono-modality metrics. We perform multi-modal registration using two networks: a spatial transformation network and a translation network. We show that by encouraging our translation network to be geometry preserving, we manage to train an accurate spatial transformation network. Compared to state-of-the-art multi-modal methods, our method is unsupervised, requires no pairs of aligned modalities for training, and can be adapted to any pair of modalities. We evaluate our method quantitatively and qualitatively on commercial datasets, showing that it performs well on several modalities and achieves accurate alignment.
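The core idea of the abstract (translate one modality into the other, then register with a simple mono-modal metric) can be illustrated with a toy NumPy sketch. This is not the paper's implementation: the learned translation and spatial transformation networks are replaced here by fixed stand-in functions (an intensity inversion and an integer pixel shift), purely to show why a geometry-preserving translation lets a plain L1 distance drive the alignment.

```python
import numpy as np

# Toy setup (assumption, not from the paper): modality B is an intensity
# inversion of modality A, misaligned by an unknown pixel shift.
rng = np.random.default_rng(0)
img_a = rng.random((32, 32))                           # modality A image
true_shift = (3, -2)                                   # unknown misalignment
img_b = 1.0 - np.roll(img_a, true_shift, axis=(0, 1))  # modality B image

def translate(x):
    """Stand-in for the geometry-preserving translation network T: A -> B.
    It changes appearance (inverts intensities) but moves no pixels."""
    return 1.0 - x

def warp(x, shift):
    """Stand-in for the spatial transformation network: an integer shift."""
    return np.roll(x, shift, axis=(0, 1))

# Registration: because translate() maps A into B's appearance, a simple
# mono-modal L1 metric suffices; we search for the shift minimizing it.
best = min(
    ((dy, dx) for dy in range(-5, 6) for dx in range(-5, 6)),
    key=lambda s: np.abs(translate(warp(img_a, s)) - img_b).mean(),
)
print(best)  # recovers the true misalignment (3, -2)
```

In the paper both components are learned jointly; the geometry-preservation constraint on the translation network is what guarantees that minimizing the mono-modal metric aligns the actual content rather than compensating for appearance differences.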