6DoF Object Pose Estimation via Differentiable Proxy Voting Loss

Paper Title

6DoF Object Pose Estimation via Differentiable Proxy Voting Loss

Authors

Xin Yu, Zheyu Zhuang, Piotr Koniusz, Hongdong Li

Abstract

Estimating a 6DoF object pose from a single image is very challenging due to occlusions or textureless appearances. Vector-field based keypoint voting has demonstrated its effectiveness and superiority in tackling those issues. However, direct regression of vector fields neglects the fact that the distances between pixels and keypoints also dramatically affect the deviations of hypotheses. In other words, small errors in direction vectors may generate severely deviated hypotheses when pixels are far away from a keypoint. In this paper, we aim to reduce such errors by incorporating the distances between pixels and keypoints into our objective. To this end, we develop a simple yet effective differentiable proxy voting loss (DPVL) which mimics the hypothesis selection in the voting procedure. By exploiting our voting loss, we are able to train our network in an end-to-end manner. Experiments on widely used datasets, i.e., LINEMOD and Occlusion LINEMOD, show that our DPVL improves pose estimation performance significantly and speeds up training convergence.
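
The geometric point in the abstract, that a fixed angular error in a direction vector produces a larger deviation the farther the pixel is from the keypoint, can be illustrated with a small sketch. This is not the paper's implementation or its exact loss. The function names (`point_to_line_distance`, `rotate`, `deviation_for_pixel`) and the choice of the perpendicular distance from the keypoint to the voting line as the deviation measure are assumptions made here for illustration only.

```python
import math


def point_to_line_distance(pixel, direction, keypoint):
    """Perpendicular distance from `keypoint` to the line that passes
    through `pixel` along `direction` (all 2D tuples)."""
    px, py = pixel
    vx, vy = direction
    kx, ky = keypoint
    norm = math.hypot(vx, vy)
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    # The 2D cross product of (keypoint - pixel) with the direction gives
    # the parallelogram area; dividing by |direction| gives the height.
    cross = (kx - px) * vy - (ky - py) * vx
    return abs(cross) / norm


def rotate(vec, angle_rad):
    """Rotate a 2D vector counter-clockwise by `angle_rad`."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * vec[0] - s * vec[1], s * vec[0] + c * vec[1])


def deviation_for_pixel(pixel, keypoint, angle_error_rad):
    """Deviation at the keypoint when the direction predicted at `pixel`
    is off from the true direction by `angle_error_rad`."""
    true_dir = (keypoint[0] - pixel[0], keypoint[1] - pixel[1])
    noisy_dir = rotate(true_dir, angle_error_rad)
    return point_to_line_distance(pixel, noisy_dir, keypoint)


if __name__ == "__main__":
    keypoint = (0.0, 0.0)
    error = math.radians(1.0)  # the same 1-degree error at both pixels
    near = deviation_for_pixel((10.0, 0.0), keypoint, error)
    far = deviation_for_pixel((100.0, 0.0), keypoint, error)
    print(f"near pixel deviation: {near:.4f}")
    print(f"far pixel deviation:  {far:.4f}")
```

Under these assumptions the deviation equals the pixel-to-keypoint distance multiplied by the sine of the angular error, so the pixel ten times farther away yields a deviation ten times larger for the same direction error. A loss that only penalizes the direction error treats both pixels identically, which is the gap the abstract describes.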
