Paper Title

EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model

Authors

Xinya Ji, Hang Zhou, Kaisiyuan Wang, Qianyi Wu, Wayne Wu, Feng Xu, Xun Cao

Abstract

Although significant progress has been made in audio-driven talking face generation, existing methods either neglect facial emotion or cannot be applied to arbitrary subjects. In this paper, we propose the Emotion-Aware Motion Model (EAMM) to generate one-shot emotional talking faces by involving an emotion source video. Specifically, we first propose an Audio2Facial-Dynamics module, which renders talking faces from audio-driven, unsupervised zero- and first-order keypoint motion. Then, by exploring the motion model's properties, we further propose an Implicit Emotion Displacement Learner to represent emotion-related facial dynamics as linearly additive displacements to the previously acquired motion representations. Comprehensive experiments demonstrate that by incorporating the results from both modules, our method can generate satisfactory talking face results on arbitrary subjects with realistic emotion patterns.
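
To make the abstract's "linearly additive displacements" idea concrete, below is a minimal sketch, not the authors' implementation. It assumes the motion representation consists of unsupervised keypoints plus first-order Jacobians (as in first-order motion models) and simply adds emotion displacements to the audio-driven terms; all function and variable names are hypothetical.

```python
# Minimal sketch of the additive composition described in the abstract:
# emotion-related dynamics are modeled as displacements added linearly to the
# audio-driven motion representation (keypoints + first-order Jacobians).
# All names here are hypothetical, not taken from the authors' code.
import numpy as np

def compose_motion(kp_audio, jac_audio, kp_emotion, jac_emotion):
    """Combine audio-driven motion with emotion displacements.

    kp_audio:    (K, 2)    keypoint positions from audio (zero-order term)
    jac_audio:   (K, 2, 2) local Jacobians from audio (first-order term)
    kp_emotion:  (K, 2)    emotion-related keypoint displacements
    jac_emotion: (K, 2, 2) emotion-related Jacobian displacements
    """
    kp_final = kp_audio + kp_emotion      # linearly additive displacement
    jac_final = jac_audio + jac_emotion
    return kp_final, jac_final

# Toy usage with K = 10 keypoints: neutral audio-driven motion plus a small
# random "emotion" displacement.
K = 10
kp, jac = compose_motion(
    np.zeros((K, 2)),
    np.tile(np.eye(2), (K, 1, 1)),
    0.01 * np.random.randn(K, 2),
    0.01 * np.random.randn(K, 2, 2),
)
print(kp.shape, jac.shape)  # (10, 2) (10, 2, 2)
```

In this reading, the composed keypoints and Jacobians would then drive a warping-based generator to render the final frames, mirroring how the outputs of the two modules are incorporated in the paper.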
