Title
L6DNet: Light 6 DoF Network for Robust and Precise Object Pose Estimation with Small Datasets
Authors
Abstract
Estimating the 3D pose of an object is a challenging task that arises in augmented reality and robotic applications. In this paper, we propose a novel approach to 6 DoF object pose estimation from a single RGB-D image. We adopt a hybrid two-stage pipeline: a data-driven stage followed by a geometric one. The data-driven stage consists of a classification CNN that estimates the object's 2D location in the image from local patches, followed by a regression CNN trained to predict the 3D locations of a set of keypoints in the camera coordinate system. To extract the pose, the geometric stage aligns the 3D points in the camera coordinate system with the corresponding 3D points in the world coordinate system by minimizing a registration error, thus computing the pose. Our experiments on the standard LineMod dataset show that our approach is more robust and accurate than state-of-the-art methods. The approach is also validated on a 6 DoF positioning task performed by visual servoing.
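The geometric stage described above, aligning keypoints in the camera frame with their known world-frame counterparts by minimizing a registration error, is classically solved in closed form by an SVD-based least-squares fit (the Kabsch algorithm). A minimal NumPy sketch of that alignment step, assuming the paper's registration follows this standard formulation (the function name and array shapes are illustrative, not from the paper):

```python
import numpy as np

def align_points(p_world, p_cam):
    """Least-squares rigid alignment (Kabsch algorithm).

    Finds rotation R and translation t minimizing
    sum_i || R @ p_world[i] + t - p_cam[i] ||^2,
    i.e. the camera pose mapping world-frame keypoints
    onto their predicted camera-frame positions.
    Both inputs are (N, 3) arrays of corresponding 3D points.
    """
    # Center both point sets on their centroids.
    cw = p_world.mean(axis=0)
    cc = p_cam.mean(axis=0)
    # Cross-covariance of the centered correspondences.
    H = (p_world - cw).T @ (p_cam - cc)
    U, _, Vt = np.linalg.svd(H)
    # Reflection guard: force det(R) = +1 so R is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cc - R @ cw
    return R, t
```

With at least three non-collinear correspondences this recovers the pose exactly in the noise-free case; with noisy CNN keypoint predictions it returns the least-squares optimum.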