3DPVNet: Patch-level 3D Hough Voting Network for 6D Pose Estimation

Paper title

3DPVNet: Patch-level 3D Hough Voting Network for 6D Pose Estimation

Authors

Yuanpeng Liu, Jun Zhou, Yuqi Zhang, Chao Ding, Jun Wang

Abstract

In this paper, we focus on estimating the 6D pose of objects in point clouds. Although the topic has been widely studied, pose estimation in point clouds remains challenging because of noise and occlusion. To address this problem, we present 3DPVNet, a novel network that uses 3D local patches to vote for the 6D pose of an object. 3DPVNet comprises three modules. First, a Patch Unification (PU) module normalizes the input patch and creates a standard local coordinate frame on it so that the patch can generate a reliable vote. Second, a Weight-guided Neighboring Feature Fusion (WNFF) module fuses neighboring features to yield a semi-global feature for the center patch. By mining the neighboring information of a local patch, the WNFF module significantly enhances the representation of local geometric characteristics, making the method robust to a certain level of noise. Third, a Patch-level Voting (PV) module regresses transformations and generates pose votes. The final pose of the object is obtained by aggregating the votes from all patches and applying a refinement step. Unlike recent voting-based methods, 3DPVNet votes at the patch level and operates directly on point clouds. It therefore requires less computation than point-level or pixel-level voting schemes and is robust to partial data. Experiments on several datasets demonstrate that 3DPVNet achieves state-of-the-art performance and is robust to noise and occlusion.
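The abstract says only that votes from all patches are aggregated; it does not say how. As a rough illustration of what Hough-style aggregation of pose votes can look like, the sketch below bins the translation part of each vote into a voxel grid, keeps the most populated bin, and averages the votes in it. Everything here is an assumption made for illustration and not the authors' implementation: the function name `aggregate_pose_votes`, the `bin_size` parameter, the (w, x, y, z) unit-quaternion representation of rotation, and the sign-aligned quaternion average (a common approximation that is only reasonable when the winning votes are close to each other).

```python
import math
from collections import defaultdict


def aggregate_pose_votes(votes, bin_size=0.05):
    """Hypothetical Hough-style aggregation of 6D pose votes.

    votes: list of (translation, quaternion) pairs, where translation is
           (x, y, z) and quaternion is a unit quaternion (w, x, y, z).
    Returns one (translation, quaternion) consensus pose.
    """
    if not votes:
        raise ValueError("need at least one vote")

    # Accumulate: quantise each translation vote into a voxel bin.
    bins = defaultdict(list)
    for t, q in votes:
        key = tuple(math.floor(c / bin_size) for c in t)
        bins[key].append((t, q))

    # The bin that collected the most votes is taken as the consensus.
    winners = max(bins.values(), key=len)
    n = len(winners)

    # Average the translations of the winning votes.
    t_mean = tuple(sum(t[i] for t, _ in winners) / n for i in range(3))

    # q and -q encode the same rotation, so flip each quaternion onto the
    # same hemisphere as the first one before summing, then renormalise.
    ref = winners[0][1]
    acc = [0.0, 0.0, 0.0, 0.0]
    for _, q in winners:
        sign = 1.0 if sum(a * b for a, b in zip(ref, q)) >= 0 else -1.0
        for i in range(4):
            acc[i] += sign * q[i]
    norm = math.sqrt(sum(a * a for a in acc))
    q_mean = tuple(a / norm for a in acc)
    return t_mean, q_mean


# Three consistent votes and one outlier; the outlier lands in its own bin.
example_votes = [
    ((0.11, 0.21, 0.31), (1.0, 0.0, 0.0, 0.0)),
    ((0.12, 0.22, 0.32), (-1.0, 0.0, 0.0, 0.0)),  # same rotation, flipped sign
    ((0.13, 0.23, 0.33), (1.0, 0.0, 0.0, 0.0)),
    ((1.00, 1.00, 1.00), (0.0, 1.0, 0.0, 0.0)),
]
translation, rotation = aggregate_pose_votes(example_votes)
```

Discarding every vote outside the winning bin is what makes this kind of scheme tolerant of outlier votes, for example votes from occluded or noisy patches. A fixed grid is the simplest choice; clustering methods such as mean shift are a common alternative when votes straddle bin boundaries.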
