Paper Title
From 2D Images to 3D Model: Weakly Supervised Multi-View Face Reconstruction with Deep Fusion
Paper Authors
Paper Abstract
While weakly supervised multi-view face reconstruction (MVR) is garnering increased attention, one critical issue remains open: how to effectively interact and fuse information from multiple images to reconstruct high-precision 3D models. In this regard, we propose a novel pipeline called Deep Fusion MVR (DF-MVR) to explore the feature correspondences between multi-view images and reconstruct high-precision 3D faces. Specifically, we present a novel multi-view feature fusion backbone that utilizes face masks to align features from multiple encoders and integrates a multi-layer attention mechanism to enhance feature interaction and fusion, resulting in a unified facial representation. Additionally, we develop a concise face mask mechanism that facilitates multi-view feature fusion and facial reconstruction by identifying common areas and guiding the network's focus toward critical facial features (e.g., eyes, brows, nose, and mouth). Experiments on the Pixel-Face and Bosphorus datasets demonstrate the superiority of our model. Without 3D annotations, DF-MVR achieves 5.2% and 3.0% RMSE improvements over existing weakly supervised MVR methods on the Pixel-Face and Bosphorus datasets, respectively. Code will be publicly available at https://github.com/weiguangzhao/DF_MVR.
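To make the described fusion idea concrete, the following is a minimal sketch, not the authors' implementation: per-view encoders, mask-weighted inputs, and a multi-layer attention block that fuses view features into one unified representation. All module names, layer sizes, and the shared-encoder choice are illustrative assumptions; the actual DF-MVR architecture is defined in the released code.

```python
# Illustrative sketch only (assumed architecture, not the official DF-MVR code).
import torch
import torch.nn as nn

class MultiViewFusion(nn.Module):
    def __init__(self, feat_dim=256, num_layers=2, num_heads=4):
        super().__init__()
        # A shared lightweight encoder applied to each masked view (assumption).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Multi-layer attention over the set of per-view features.
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=num_heads,
                                           batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, views, masks):
        # views: (B, V, 3, H, W) multi-view images
        # masks: (B, V, 1, H, W) face-region masks (eyes, brows, nose, mouth)
        B, V = views.shape[:2]
        x = (views * masks).flatten(0, 1)      # suppress non-face regions
        feats = self.encoder(x).flatten(1)     # (B*V, feat_dim)
        feats = feats.view(B, V, -1)           # one token per view
        fused = self.fusion(feats)             # cross-view attention
        return fused.mean(dim=1)               # unified facial representation

# Usage example: fuse three views of each face into a single 256-d code.
model = MultiViewFusion()
views = torch.randn(2, 3, 3, 128, 128)
masks = torch.ones(2, 3, 1, 128, 128)
rep = model(views, masks)                      # shape: (2, 256)
```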