SparseFormer: Attention-based Depth Completion Network

Paper title

SparseFormer: Attention-based Depth Completion Network

Paper authors

Frederik Warburg, Michael Ramamonjisoa, Manuel López-Antequera

Paper abstract

Most pipelines for Augmented and Virtual Reality estimate the ego-motion of the camera by creating a map of sparse 3D landmarks. In this paper, we tackle the problem of depth completion, that is, densifying this sparse 3D map using RGB images as guidance. This remains a challenging problem due to the low density, non-uniform and outlier-prone 3D landmarks produced by SfM and SLAM pipelines. We introduce a transformer block, SparseFormer, that fuses 3D landmarks with deep visual features to produce dense depth. The SparseFormer has a global receptive field, making the module especially effective for depth completion with low-density and non-uniform landmarks. To address the issue of depth outliers among the 3D landmarks, we introduce a trainable refinement module that filters outliers through attention between the sparse landmarks.
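
The idea of a global receptive field over sparse landmarks can be illustrated with a toy cross-attention step. This is a minimal sketch under stated assumptions, not the SparseFormer architecture: the function name `attend_to_landmarks`, the 2-dimensional feature vectors, and the choice to use raw landmark depths as attention values are all hypothetical and chosen only for illustration. Each pixel feature acts as a query against every landmark feature, so a pixel can draw on landmarks anywhere in the image, however sparse or unevenly placed they are.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_to_landmarks(pixel_features, landmark_features, landmark_depths):
    """Scaled dot-product cross-attention from pixels to sparse landmarks.

    Hypothetical helper: every pixel feature (query) is compared with every
    landmark feature (key), and the landmark depths (values) are averaged
    with the resulting attention weights. Returns one depth per pixel.
    """
    dim = len(landmark_features[0])
    scale = 1.0 / math.sqrt(dim)
    dense_depths = []
    for query in pixel_features:
        # One score per landmark: the pixel sees all landmarks at once.
        scores = [
            scale * sum(q * k for q, k in zip(query, key))
            for key in landmark_features
        ]
        weights = softmax(scores)
        dense_depths.append(
            sum(w * d for w, d in zip(weights, landmark_depths))
        )
    return dense_depths


# Toy data (assumed values): three pixels, two landmarks.
pixel_features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
landmark_features = [[4.0, 0.0], [0.0, 4.0]]
landmark_depths = [2.0, 10.0]

dense = attend_to_landmarks(pixel_features, landmark_features, landmark_depths)
print(dense)
```

In this toy setup each output is a convex combination of the landmark depths, so a pixel whose feature resembles one landmark is pulled toward that landmark's depth, while a pixel equally similar to both lands halfway between them. A trained network would instead learn the query, key, and value projections and decode the fused features into depth.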
