PyraPose: Feature Pyramids for Fast and Accurate Object Pose Estimation under Domain Shift

Paper title

PyraPose: Feature Pyramids for Fast and Accurate Object Pose Estimation under Domain Shift

Authors

Stefan Thalhammer, Markus Leitner, Timothy Patten, Markus Vincze

Abstract

Object pose estimation enables robots to understand and interact with their environments. Training with synthetic data is necessary in order to adapt to novel situations. Unfortunately, pose estimation under domain shift, i.e., training on synthetic data and testing in the real world, is challenging. Deep learning-based approaches currently perform best when using encoder-decoder networks but typically do not generalize to new scenarios with different scene characteristics. We argue that patch-based approaches, instead of encoder-decoder networks, are more suited for synthetic-to-real transfer because local to global object information is better represented. To that end, we present a novel approach based on a specialized feature pyramid network to compute multi-scale features for creating pose hypotheses on different feature map resolutions in parallel. Our single-shot pose estimation approach is evaluated on multiple standard datasets and outperforms the state of the art by up to 35%. We also perform grasping experiments in the real world to demonstrate the advantage of using synthetic data to generalize to novel environments.
