Paper Title
Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction
Paper Authors
Paper Abstract
This paper presents a high-quality human motion prediction method that accurately predicts future human poses given observed ones. Our method is based on the observation that a good initial guess of the future poses is very helpful in improving the forecasting accuracy. This motivates us to propose a novel two-stage prediction framework: an init-prediction network that computes only the initial guess, followed by a formal-prediction network that predicts the target future poses based on that guess. More importantly, we extend this idea further and design a multi-stage prediction framework in which each stage predicts an initial guess for the next stage, which brings further performance gains. To fulfill the prediction task at each stage, we propose a network comprising Spatial Dense Graph Convolutional Networks (S-DGCN) and Temporal Dense Graph Convolutional Networks (T-DGCN). Alternately executing the two networks helps extract spatiotemporal features over the global receptive field of the whole pose sequence. Together, these design choices allow our method to outperform previous approaches by large margins: 6%-7% on Human3.6M, 5%-10% on CMU-MoCap, and 13%-16% on 3DPW.
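The multi-stage idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that "dense" means a fully connected adjacency over joints (spatial) or over frames (temporal), that the first guess is the last observed pose repeated, and that each stage adds a residual to its input. The names `spatial_gcn`, `temporal_gcn`, `init_stage`, `run_stage`, and `multi_stage_predict` are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

def spatial_gcn(x, A_s, W):
    # x: (T, J, C) pose sequence; A_s: (J, J) dense adjacency over joints.
    # Every joint aggregates features from all joints within each frame.
    return np.tanh(np.einsum("ij,tjc->tic", A_s, x) @ W)

def temporal_gcn(x, A_t, W):
    # A_t: (T, T) dense adjacency over frames.
    # Every frame aggregates features from all frames for each joint.
    return np.tanh(np.einsum("st,tjc->sjc", A_t, x) @ W)

def init_stage(rng, T, J, C, n_layers=2):
    # Hypothetical helper: random (untrained) parameters for one stage.
    return [
        (rng.normal(scale=0.1, size=(J, J)), rng.normal(scale=0.1, size=(C, C)),
         rng.normal(scale=0.1, size=(T, T)), rng.normal(scale=0.1, size=(C, C)))
        for _ in range(n_layers)
    ]

def run_stage(seq, params):
    # Alternate spatial and temporal graph convolutions, then add a residual
    # (the residual connection is an assumption of this sketch).
    h = seq
    for A_s, W_s, A_t, W_t in params:
        h = spatial_gcn(h, A_s, W_s)
        h = temporal_gcn(h, A_t, W_t)
    return seq + h

def multi_stage_predict(observed, n_future, stages):
    # Assumed starting guess: repeat the last observed pose n_future times.
    guess = np.repeat(observed[-1:], n_future, axis=0)
    outputs = []
    for params in stages:
        seq = np.concatenate([observed, guess], axis=0)  # (T_obs + n_future, J, C)
        guess = run_stage(seq, params)[-n_future:]       # guess for the next stage
        outputs.append(guess)
    return outputs  # the last entry is the final prediction

rng = np.random.default_rng(0)
T_obs, n_future, J, C = 10, 5, 4, 3
observed = rng.normal(size=(T_obs, J, C))
stages = [init_stage(rng, T_obs + n_future, J, C) for _ in range(3)]
predictions = multi_stage_predict(observed, n_future, stages)
```

Each stage sees the observed poses concatenated with the previous stage's guess, so later stages start from a progressively refined estimate. Because both adjacency matrices are dense, a single spatial plus temporal pass already mixes information across all joints and all frames, which is one way to read the "global receptive field" mentioned above.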