Paper Title
Self6D: Self-Supervised Monocular 6D Object Pose Estimation
Paper Authors

Paper Abstract
6D object pose estimation is a fundamental problem in computer vision. Convolutional Neural Networks (CNNs) have recently proven capable of predicting reliable 6D pose estimates even from monocular images. Nonetheless, CNNs are known to be extremely data-driven, and acquiring adequate annotations is oftentimes very time-consuming and labor-intensive. To overcome this shortcoming, we propose the idea of monocular 6D pose estimation by means of self-supervised learning, removing the need for real annotations. After training our proposed network fully supervised with synthetic RGB data, we leverage recent advances in neural rendering to further self-supervise the model on unannotated real RGB-D data, seeking a visually and geometrically optimal alignment. Extensive evaluations demonstrate that our proposed self-supervision is able to significantly enhance the model's original performance, outperforming all other methods relying on synthetic data or employing elaborate techniques from the domain adaptation realm.
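
To make the training recipe described in the abstract more concrete, below is a minimal sketch of one self-supervision step on an unannotated real RGB-D frame: a pose network predicts a 6D pose, a differentiable renderer produces a synthetic view of the object under that pose, and visual plus geometric alignment losses are backpropagated. The names `pose_net` and `render_rgbd`, as well as the simple L1 alignment losses, are illustrative assumptions and do not reflect the authors' actual implementation.

```python
import torch
import torch.nn.functional as F

def self_supervised_step(pose_net, render_rgbd, image, depth, obj_model, optimizer):
    """One self-supervision step on an unannotated real RGB-D frame.

    `pose_net` and `render_rgbd` are hypothetical placeholders: a CNN that
    predicts a 6D pose (rotation R, translation t) from the RGB image, and a
    differentiable renderer that returns a synthetic RGB image, depth map,
    and object mask for `obj_model` under that pose.
    """
    R, t = pose_net(image)                                 # predicted 6D pose
    rendered_rgb, rendered_depth, mask = render_rgbd(obj_model, R, t)

    # Visual alignment: compare rendered and observed appearance inside the
    # rendered object mask (a proxy for the paper's visually optimal alignment).
    visual_loss = F.l1_loss(rendered_rgb * mask, image * mask)

    # Geometric alignment: compare rendered depth against the sensor depth
    # wherever the sensor reading is valid and the object is rendered.
    valid = mask * (depth > 0).float()
    geometric_loss = F.l1_loss(rendered_depth * valid, depth * valid)

    loss = visual_loss + geometric_loss
    optimizer.zero_grad()
    loss.backward()        # gradients flow through the differentiable renderer
    optimizer.step()
    return loss.item()
```

In this sketch no ground-truth pose is used at any point; the only supervision comes from how well the rendering under the predicted pose agrees with the observed RGB image and depth map, which is the key idea behind removing the need for real annotations.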