Paper Title
Leveraging Equivariant Features for Absolute Pose Regression
Authors
Abstract
While end-to-end approaches have achieved state-of-the-art performance in many perception tasks, they cannot yet compete with 3D geometry-based methods in pose estimation. Moreover, absolute pose regression has been shown to be more closely related to image retrieval than to geometry-based pose estimation. We therefore hypothesize that the statistical features learned by classical Convolutional Neural Networks do not carry enough geometric information to reliably solve this inherently geometric task. In this paper, we demonstrate how a translation- and rotation-equivariant Convolutional Neural Network directly induces representations of camera motions in the feature space. We then show that this geometric property allows for implicitly augmenting the training data under a whole group of image plane-preserving transformations. Therefore, we argue that directly learning equivariant features is preferable to learning data-intensive intermediate representations. Comprehensive experimental validation demonstrates that our lightweight model outperforms existing ones on standard datasets.
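To make the notion of equivariance in the abstract concrete, here is a minimal NumPy sketch. It is not the paper's network: it is a toy single-channel filter on a periodic grid, and the helper names (circular_filter, translate) are hypothetical. It checks that filtering commutes with image translations for any kernel, and with 90-degree rotations once the kernel is made symmetric under those rotations (an assumption chosen here for illustration).

```python
import numpy as np

def circular_filter(image, kernel):
    """Cross-correlate a square image with a centered 3x3 kernel, wrapping at the borders."""
    out = np.zeros_like(image, dtype=float)
    for i in range(3):
        for j in range(3):
            # Offsets are measured from the kernel center, so they range over -1, 0, 1.
            dy, dx = i - 1, j - 1
            # Rolling by (-dy, -dx) places pixel (y + dy, x + dx) at position (y, x).
            out += kernel[i, j] * np.roll(image, shift=(-dy, -dx), axis=(0, 1))
    return out

def translate(image, ty, tx):
    """Cyclically shift the image by (ty, tx) pixels."""
    return np.roll(image, shift=(ty, tx), axis=(0, 1))

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))

# Translation equivariance: shifting then filtering equals filtering then shifting.
translation_equivariant = np.allclose(
    circular_filter(translate(image, 2, 3), kernel),
    translate(circular_filter(image, kernel), 2, 3),
)

# Average the kernel over its four 90-degree rotations so that it is
# unchanged by np.rot90 (illustrative assumption, not the paper's construction).
symmetric_kernel = sum(np.rot90(kernel, k) for k in range(4)) / 4.0

# Rotation equivariance (90-degree steps): rotating then filtering equals
# filtering then rotating, given the rotation-symmetric kernel.
rotation_equivariant = np.allclose(
    circular_filter(np.rot90(image), symmetric_kernel),
    np.rot90(circular_filter(image, symmetric_kernel)),
)

print(translation_equivariant, rotation_equivariant)
```

In this toy setting a known transformation of the input produces a predictable transformation of the features, which is the property the abstract relies on when it speaks of camera motions being represented in feature space.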