Paper Title
Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
Paper Authors
Paper Abstract
The rapid progress in 3D scene understanding has come with a growing demand for data; however, collecting and annotating 3D scenes (e.g. point clouds) is notoriously hard. For example, the number of scenes (e.g. indoor rooms) that can be accessed and scanned might be limited; even given sufficient data, acquiring 3D labels (e.g. instance masks) requires intensive human labor. In this paper, we explore data-efficient learning for 3D point clouds. As a first step in this direction, we propose Contrastive Scene Contexts, a 3D pre-training method that makes use of both point-level correspondences and spatial contexts in a scene. Our method achieves state-of-the-art results on a suite of benchmarks where training data or labels are scarce. Our study reveals that exhaustive labelling of 3D point clouds may be unnecessary; remarkably, on ScanNet, even using 0.1% of point labels, we still achieve 89% (instance segmentation) and 96% (semantic segmentation) of the baseline performance obtained with full annotations.
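The abstract does not spell out the pre-training objective, only that it combines point-level correspondences with spatial context. Below is a minimal sketch of one way such a loss could look: a PointInfoNCE-style objective computed separately within spatial partitions of matched point pairs from two views of the same scene. This is not the authors' released code; the function name contrastive_scene_contexts, the angle-based partitioning around the scene center, and the hyperparameters n_partitions and tau are illustrative assumptions.

```python
# A hedged sketch of a partition-wise contrastive loss, NOT the paper's
# implementation. Assumptions: N matched point pairs across two augmented
# views (row i of feats_a corresponds to row i of feats_b), and spatial
# context injected by bucketing pairs by anchor-point angle around the
# scene center. All names and hyperparameters here are hypothetical.
import math

import torch
import torch.nn.functional as F


def contrastive_scene_contexts(feats_a, feats_b, xyz_a,
                               n_partitions=4, tau=0.07):
    """feats_a, feats_b: (N, C) features of N matched point pairs.
    xyz_a: (N, 3) anchor-point coordinates, used only to assign each
    pair to a spatial partition (the scene context)."""
    # Spatial context: bucket each pair by the angle of its anchor point
    # around the scene center, so negatives come from the same region.
    centered = xyz_a - xyz_a.mean(dim=0, keepdim=True)
    angle = torch.atan2(centered[:, 1], centered[:, 0])       # [-pi, pi)
    part = ((angle + math.pi) / (2 * math.pi) * n_partitions)
    part = part.long().clamp(max=n_partitions - 1)

    feats_a = F.normalize(feats_a, dim=1)
    feats_b = F.normalize(feats_b, dim=1)

    losses = []
    for p in range(n_partitions):
        idx = (part == p).nonzero(as_tuple=True)[0]
        if idx.numel() < 2:                # need at least one negative
            continue
        # InfoNCE within the partition: the matched point in the other
        # view is the positive; other pairs in the same partition are
        # the negatives.
        logits = feats_a[idx] @ feats_b[idx].t() / tau        # (n_p, n_p)
        target = torch.arange(idx.numel(), device=logits.device)
        losses.append(F.cross_entropy(logits, target))
    if not losses:                          # degenerate case: too few pairs
        return feats_a.new_zeros(())
    return torch.stack(losses).mean()
```

One motivation for averaging a separate InfoNCE term per partition, rather than contrasting over all pairs at once, is that the loss then depends on where points sit in the scene, not just on their features; how the real method defines its partitions is left open by the abstract.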