Paper Title
img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation
Authors
Abstract
We propose real-time, six degrees of freedom (6DoF), 3D face pose estimation without face detection or landmark localization. We observe that estimating the 6DoF rigid transformation of a face is a simpler problem than facial landmark detection, which is often used for 3D face alignment. In addition, 6DoF offers more information than face bounding box labels. We leverage these observations to make multiple contributions: (a) We describe an easily trained, efficient, Faster R-CNN-based model which regresses 6DoF pose for all faces in the photo, without preliminary face detection. (b) We explain how pose is converted and kept consistent between the input photo and arbitrary crops created while training and evaluating our model. (c) Finally, we show how face poses can replace detection bounding box training labels. Tests on AFLW2000-3D and BIWI show that our method runs in real time and outperforms state-of-the-art (SotA) face pose estimators. Remarkably, our method also surpasses SotA models of comparable complexity on the WIDER FACE detection benchmark, despite not being optimized on bounding box labels.
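To illustrate the claim that a 6DoF pose carries more information than a bounding box, the following minimal sketch derives a 2D box from a pose by projecting 3D reference points into the image. It is not the paper's implementation: the pose layout (a rotation vector followed by a translation), the pinhole intrinsics matrix, the reference points, and the function names rotvec_to_matrix and pose_to_bbox are all assumptions made for illustration.

```python
import numpy as np


def rotvec_to_matrix(rotvec):
    """Rodrigues' formula: rotation vector (axis * angle) -> 3x3 rotation matrix."""
    rotvec = np.asarray(rotvec, dtype=float)
    theta = np.linalg.norm(rotvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rotvec / theta
    # Skew-symmetric cross-product matrix of the unit axis k.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def pose_to_bbox(pose, points_3d, intrinsics):
    """Project 3D points under a 6DoF pose; return (x_min, y_min, x_max, y_max).

    pose: assumed layout (rx, ry, rz, tx, ty, tz), rotation vector then translation.
    points_3d: (N, 3) reference points in the face's own coordinate frame.
    intrinsics: 3x3 pinhole camera matrix (assumed known).
    """
    pose = np.asarray(pose, dtype=float)
    points_3d = np.asarray(points_3d, dtype=float)
    R = rotvec_to_matrix(pose[:3])
    t = pose[3:]
    cam = points_3d @ R.T + t          # rigid transform into camera frame
    proj = cam @ np.asarray(intrinsics, dtype=float).T
    uv = proj[:, :2] / proj[:, 2:3]    # perspective divide
    return np.array([uv[:, 0].min(), uv[:, 1].min(),
                     uv[:, 0].max(), uv[:, 1].max()])


# Hypothetical example: a 0.2-unit square "face" one unit in front of the camera.
square = np.array([[-0.1, -0.1, 0.0], [0.1, -0.1, 0.0],
                   [0.1, 0.1, 0.0], [-0.1, 0.1, 0.0]])
camera = np.array([[500.0, 0.0, 320.0],
                   [0.0, 500.0, 240.0],
                   [0.0, 0.0, 1.0]])
box = pose_to_bbox([0, 0, 0, 0, 0, 1], square, camera)
print(box)
```

The reverse direction is not possible in general: a box fixes only position and extent in the image, so it cannot determine the rotation. This is one way to read the abstract's statement that pose labels can stand in for box labels.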