Monocular Spherical Depth Estimation with Explicitly Connected Weak Layout Cues

Paper Title

Monocular Spherical Depth Estimation with Explicitly Connected Weak Layout Cues

Authors

Nikolaos Zioulis, Federico Alvarez, Dimitrios Zarpalas, Petros Daras

Abstract

Spherical cameras capture scenes in a holistic manner and have been used for room layout estimation. Recently, with the availability of appropriate datasets, there has also been progress in depth estimation from a single omnidirectional image. While these two tasks are complementary, few works have explored them in parallel to advance indoor geometric perception, and those that have done so either relied on synthetic data or used small-scale datasets, as few options are available that include both layout annotations and dense depth maps in real scenes. This is partly due to the necessity of manual annotations for room layouts. In this work, we move beyond this limitation and generate a 360 geometric vision (360V) dataset that includes multiple modalities, multi-view stereo data and automatically generated weak layout cues. We also explore an explicit coupling between the two tasks to integrate them into a single-shot trained model. We rely on depth-based layout reconstruction and layout-based depth attention, demonstrating increased performance across both tasks. By using single 360 cameras to scan rooms, the opportunity for facile and quick building-scale 3D scanning arises.
