Paper Title
SparseFormer: Attention-based Depth Completion Network
Authors
Abstract
Most pipelines for Augmented and Virtual Reality estimate the ego-motion of the camera by creating a map of sparse 3D landmarks. In this paper, we tackle the problem of depth completion, that is, densifying this sparse 3D map using RGB images as guidance. This remains a challenging problem because the 3D landmarks produced by SfM and SLAM pipelines are low-density, non-uniform, and outlier-prone. We introduce a transformer block, SparseFormer, that fuses 3D landmarks with deep visual features to produce dense depth. The SparseFormer has a global receptive field, making the module especially effective for depth completion with low-density and non-uniform landmarks. To address the issue of depth outliers among the 3D landmarks, we introduce a trainable refinement module that filters outliers through attention between the sparse landmarks.
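To make the idea of a global receptive field concrete, the sketch below shows plain single-head scaled dot-product cross-attention in which every pixel feature attends to every sparse landmark, however far apart they are in the image. It is an illustration only and not the SparseFormer architecture from the paper: the function names (`softmax`, `attend_to_landmarks`), the absence of learned projections, the use of raw depths as attention values, and all toy numbers are assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_to_landmarks(pixel_feats, landmark_feats, landmark_depths):
    """Single-head scaled dot-product cross-attention (hypothetical helper).

    Each pixel feature (query) is compared with every landmark feature (key).
    The landmark depths act as values, so each output depth is a convex
    combination of the sparse depths. No learned projections are applied.
    """
    dim = len(landmark_feats[0])
    scale = 1.0 / math.sqrt(dim)
    dense = []
    for q in pixel_feats:
        # Similarity of this pixel to all landmarks: a global receptive field.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in landmark_feats]
        weights = softmax(scores)
        dense.append(sum(w * d for w, d in zip(weights, landmark_depths)))
    return dense


# Toy input: 3 pixels and 2 landmarks with 2-D features (all values made up).
pixel_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
landmark_feats = [[4.0, 0.0], [0.0, 4.0]]
landmark_depths = [2.0, 10.0]
dense_depth = attend_to_landmarks(pixel_feats, landmark_feats, landmark_depths)
```

In this toy setup the first pixel resembles the first landmark and so receives a depth close to 2.0, the second pixel leans toward 10.0, and the third pixel is equally similar to both and lands midway between them. A trained network would additionally learn the query, key, and value projections and decode the fused features into depth rather than averaging depths directly.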