Paper Title

One-Shot Identity-Preserving Portrait Reenactment

Authors

Sitao Xiang, Yuming Gu, Pengda Xiang, Mingming He, Koki Nagano, Haiwei Chen, Hao Li

Abstract

We present a deep learning-based framework for portrait reenactment from a single picture of a target (one-shot) and a video of a driving subject. Existing facial reenactment methods suffer from identity mismatch and produce inconsistent identities when a target and a driving subject are different (cross-subject), especially in one-shot settings. In this work, we aim to address identity preservation in cross-subject portrait reenactment from a single picture. We introduce a novel technique that can disentangle identity from expressions and poses, allowing identity preserving portrait reenactment even when the driver's identity is very different from that of the target. This is achieved by a novel landmark disentanglement network (LD-Net), which predicts personalized facial landmarks that combine the identity of the target with expressions and poses from a different subject. To handle portrait reenactment from unseen subjects, we also introduce a feature dictionary-based generative adversarial network (FD-GAN), which locally translates 2D landmarks into a personalized portrait, enabling one-shot portrait reenactment under large pose and expression variations. We validate the effectiveness of our identity disentangling capabilities via an extensive ablation study, and our method produces consistent identities for cross-subject portrait reenactment. Our comprehensive experiments show that our method significantly outperforms the state-of-the-art single-image facial reenactment methods. We will release our code and models for academic use.
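The abstract describes a two-stage pipeline: LD-Net first predicts personalized landmarks that fuse the target's identity with the driver's expression and pose, and FD-GAN then translates those landmarks into a portrait conditioned on appearance features from the single target photo. A minimal shape-level sketch of that data flow is below; all class names, dimensions, and the linear placeholder layers are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for illustration (not from the paper)
ID_DIM, POSE_EXPR_DIM, N_LANDMARKS, IMG_SIZE = 64, 32, 68, 128

class LDNet:
    """Placeholder for the landmark disentanglement network (LD-Net):
    combines the target's identity code with the driver's expression/pose
    code to predict personalized 2D landmarks."""
    def __init__(self):
        self.w = rng.standard_normal((ID_DIM + POSE_EXPR_DIM, N_LANDMARKS * 2)) * 0.01

    def __call__(self, target_identity, driver_expr_pose):
        z = np.concatenate([target_identity, driver_expr_pose])
        # Landmarks in normalized image coordinates [-1, 1]
        return np.tanh(z @ self.w).reshape(N_LANDMARKS, 2)

class FDGAN:
    """Placeholder for the feature-dictionary generator (FD-GAN):
    translates the predicted landmarks into a portrait frame, conditioned
    on appearance features extracted from the one-shot target photo."""
    def __init__(self):
        self.w = rng.standard_normal((N_LANDMARKS * 2 + ID_DIM, IMG_SIZE * IMG_SIZE * 3)) * 0.01

    def __call__(self, landmarks, feature_dictionary):
        z = np.concatenate([landmarks.ravel(), feature_dictionary])
        # Pixel intensities clipped to [0, 1]
        return np.clip(z @ self.w, 0.0, 1.0).reshape(IMG_SIZE, IMG_SIZE, 3)

# One-shot reenactment: one target photo, one frame of the driving video
target_identity = rng.standard_normal(ID_DIM)        # identity code from target photo
feature_dictionary = rng.standard_normal(ID_DIM)     # target appearance features
driver_expr_pose = rng.standard_normal(POSE_EXPR_DIM)  # expression/pose from driver frame

landmarks = LDNet()(target_identity, driver_expr_pose)
frame = FDGAN()(landmarks, feature_dictionary)
print(landmarks.shape, frame.shape)  # (68, 2) (128, 128, 3)
```

Reenacting a full video would repeat the two calls per driving frame while keeping `target_identity` and `feature_dictionary` fixed, which is what lets the target's identity stay consistent across frames.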
