Paper Title
C2F-FWN: Coarse-to-Fine Flow Warping Network for Spatial-Temporal Consistent Motion Transfer
Paper Authors
Paper Abstract
Human video motion transfer (HVMT) aims to synthesize videos in which one person imitates the actions of other persons. Although existing GAN-based HVMT methods have achieved great success, they either fail to preserve appearance details due to the loss of spatial consistency between synthesized and exemplary images, or generate incoherent video results due to the lack of temporal consistency among video frames. In this paper, we propose the Coarse-to-Fine Flow Warping Network (C2F-FWN) for spatial-temporal consistent HVMT. In particular, C2F-FWN utilizes coarse-to-fine flow warping and Layout-Constrained Deformable Convolution (LC-DConv) to improve spatial consistency, and employs a Flow Temporal Consistency (FTC) Loss to enhance temporal consistency. In addition, provided with multi-source appearance inputs, C2F-FWN can support appearance attribute editing with great flexibility and efficiency. Besides public datasets, we also collected a large-scale HVMT dataset named SoloDance for evaluation. Extensive experiments conducted on our SoloDance dataset and the iPER dataset show that our approach outperforms state-of-the-art HVMT methods in terms of both spatial and temporal consistency. Source code and the SoloDance dataset are available at https://github.com/wswdx/C2F-FWN.
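For readers unfamiliar with the flow warping primitive that gives the network its name, the following is a minimal generic sketch in PyTorch: given a source image and a dense flow field, it samples source pixels at flow-displaced locations. This is only an illustration of the basic warping operation, not the paper's C2F-FWN implementation; the function name and tensor conventions here are assumptions for the example.

```python
import torch
import torch.nn.functional as F

def flow_warp(src: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp `src` (N, C, H, W) with a dense flow field `flow` (N, 2, H, W),
    where flow[:, 0] / flow[:, 1] are horizontal / vertical pixel displacements."""
    n, _, h, w = src.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=src.dtype, device=src.device),
        torch.arange(w, dtype=src.dtype, device=src.device),
        indexing="ij",
    )
    grid_x = xs.unsqueeze(0) + flow[:, 0]  # (N, H, W)
    grid_y = ys.unsqueeze(0) + flow[:, 1]
    # Normalize coordinates to [-1, 1] as required by grid_sample.
    grid_x = 2.0 * grid_x / max(w - 1, 1) - 1.0
    grid_y = 2.0 * grid_y / max(h - 1, 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)  # (N, H, W, 2)
    return F.grid_sample(src, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)
```

In a coarse-to-fine scheme such as the one the abstract describes, a warp of this kind would typically be applied at several resolutions, with each stage refining the flow estimated at the previous, coarser stage.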