Paper Title

Controllable Video Generation through Global and Local Motion Dynamics

Authors

Aram Davtyan, Paolo Favaro

Abstract

We present GLASS, a method for Global and Local Action-driven Sequence Synthesis. GLASS is a generative model that is trained on video sequences in an unsupervised manner and that can animate an input image at test time. The method learns to segment frames into foreground-background layers and to generate transitions of the foregrounds over time through a global and local action representation. Global actions are explicitly related to 2D shifts, while local actions are instead related to (both geometric and photometric) local deformations. GLASS uses a recurrent neural network to transition between frames and is trained through a reconstruction loss. We also introduce W-Sprites (Walking Sprites), a novel synthetic dataset with a predefined action space. We evaluate our method on both W-Sprites and real datasets, and find that GLASS is able to generate realistic video sequences from a single input image and to successfully learn a more advanced action space than in prior work.
