Paper Title
Learning to Parse Wireframes in Images of Man-Made Environments
Authors
Abstract
In this paper, we propose a learning-based approach to automatically extracting a "wireframe" representation from images of cluttered man-made environments. The wireframe (see Fig. 1) contains all salient straight lines in the scene and their junctions, which together encode large-scale geometry and object shapes efficiently and accurately. To this end, we have built a very large new dataset of over 5,000 images with wireframes thoroughly labelled by humans. We propose two convolutional neural networks suited to extracting junctions and lines with large spatial support, respectively. The networks trained on our dataset achieve significantly better performance than state-of-the-art methods for junction detection and line segment detection, respectively. We have conducted extensive experiments to evaluate, quantitatively and qualitatively, the wireframes obtained by our method, and show that effectively and efficiently parsing wireframes from images of man-made environments is a feasible goal within reach. Such wireframes could benefit many important visual tasks, such as feature correspondence, 3D reconstruction, vision-based mapping, localization, and navigation. The data and source code are available at https://github.com/huangkuns/wireframe.