Paper Title
Structured 3D Features for Reconstructing Controllable Avatars
Paper Authors
Paper Abstract
We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface. The 3D points have associated semantics and can move freely in 3D space. This allows for optimal coverage of the person of interest, beyond just the body shape, which, in turn, helps in modeling accessories, hair, and loose clothing. Owing to this, we present a complete 3D transformer-based attention framework which, given a single image of a person in an unconstrained pose, generates an animatable 3D reconstruction with albedo and illumination decomposition, as a result of a single end-to-end model, trained semi-supervised, and with no additional postprocessing. We show that our S3F model surpasses the previous state-of-the-art on various tasks, including monocular 3D reconstruction, as well as albedo and shading estimation. Moreover, we show that the proposed methodology allows novel view synthesis, relighting, and re-posing of the reconstruction, and can naturally be extended to handle multiple input images (e.g., different views of a person, or the same view in different poses, in a video). Finally, we demonstrate the editing capabilities of our model for 3D virtual try-on applications.
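The core pooling step described in the abstract, projecting 3D points sampled from the body surface into the image and gathering pixel-aligned features for each point, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the pinhole camera model, the `pool_pixel_aligned_features` helper, and all tensor conventions below are assumptions made purely for illustration.

```python
import torch
import torch.nn.functional as F

def pool_pixel_aligned_features(feature_map, points_3d, K):
    """Pool pixel-aligned image features onto 3D body-surface points.

    feature_map: (1, C, H, W) image features from a 2D backbone (assumed).
    points_3d:   (N, 3) points sampled from a parametric body surface,
                 given in camera coordinates (hypothetical convention).
    K:           (3, 3) pinhole camera intrinsics (assumed known).
    Returns:     (N, C) per-point feature vectors.
    """
    # Project the 3D points into the image plane: x = K @ X, then divide by depth.
    proj = (K @ points_3d.T).T                        # (N, 3)
    uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)   # (N, 2) pixel coordinates

    # Normalize pixel coordinates to [-1, 1], as expected by grid_sample.
    _, _, H, W = feature_map.shape
    grid = torch.stack(
        [2.0 * uv[:, 0] / (W - 1) - 1.0,
         2.0 * uv[:, 1] / (H - 1) - 1.0], dim=-1
    ).view(1, 1, -1, 2)                               # (1, 1, N, 2)

    # Bilinearly sample one feature vector per projected point.
    sampled = F.grid_sample(feature_map, grid, align_corners=True)  # (1, C, 1, N)
    return sampled[0, :, 0].T                         # (N, C)
```

In the pipeline sketched by the abstract, each pooled feature would then be combined with the point's semantic code and passed to the transformer-based attention framework; the snippet above only illustrates the feature-pooling step.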