Paper Title
Keypoint-Based Category-Level Object Pose Tracking from an RGB Sequence with Uncertainty Estimation
Paper Authors
Paper Abstract
We propose a single-stage, category-level 6-DoF pose estimation algorithm that simultaneously detects and tracks instances of objects within a known category. Our method takes as input the previous and current frames from a monocular RGB video, as well as predictions from the previous frame, to predict the bounding cuboid and 6-DoF pose (up to scale). Internally, a deep network predicts distributions over object keypoints (vertices of the bounding cuboid) in image coordinates, after which a novel probabilistic filtering process integrates across estimates before computing the final pose using PnP. Our framework allows the system to take previous uncertainties into consideration when predicting the current frame, resulting in predictions that are more accurate and stable than those of single-frame methods. Extensive experiments show that our method outperforms existing approaches on the challenging Objectron benchmark of annotated object videos. We also demonstrate the usability of our work in an augmented reality setting.
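The abstract's "probabilistic filtering process" that integrates keypoint estimates across frames can be illustrated with a minimal sketch. The paper's actual filter is not specified in this abstract, so the example below is a hypothetical simplification: each keypoint's image-coordinate estimate is treated as an independent per-axis Gaussian, and the previous frame's propagated estimate is fused with the current detection by precision (inverse-variance) weighting. The function name and interface are illustrative, not the authors' API.

```python
import numpy as np

def fuse_gaussian_keypoints(mu_prev, var_prev, mu_curr, var_curr):
    """Precision-weighted fusion of two independent Gaussian keypoint
    estimates (per image coordinate).

    Hypothetical illustration of integrating uncertain keypoint
    predictions across frames; the actual filter in the paper may differ.
    mu_* are (x, y) means in pixels, var_* are per-axis variances.
    """
    w_prev = 1.0 / var_prev          # precision of previous estimate
    w_curr = 1.0 / var_curr          # precision of current detection
    var_fused = 1.0 / (w_prev + w_curr)
    mu_fused = var_fused * (w_prev * mu_prev + w_curr * mu_curr)
    return mu_fused, var_fused

# Example: one cuboid vertex; the previous estimate is uncertain
# (variance 4 px^2), the current detection is confident (1 px^2),
# so the fused mean lands closer to the current detection.
mu, var = fuse_gaussian_keypoints(
    np.array([100.0, 50.0]), np.array([4.0, 4.0]),
    np.array([104.0, 54.0]), np.array([1.0, 1.0]),
)
```

In a full pipeline, the fused 2-D keypoints for all eight cuboid vertices would then be passed, together with the canonical cuboid geometry, to a PnP solver to recover the 6-DoF pose up to scale.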