Paper Title
Dynamic GCN: Context-enriched Topology Learning for Skeleton-based Action Recognition
Paper Authors
Paper Abstract
Graph Convolutional Networks (GCNs) have attracted increasing interest for the task of skeleton-based action recognition. The key lies in the design of the graph structure, which encodes skeleton topology information. In this paper, we propose Dynamic GCN, in which a novel convolutional neural network named Context-encoding Network (CeN) is introduced to learn skeleton topology automatically. In particular, when learning the dependency between two joints, contextual features from the remaining joints are incorporated in a global manner. CeN is extremely lightweight yet effective, and can be embedded into a graph convolutional layer. By stacking multiple CeN-enabled graph convolutional layers, we build Dynamic GCN. Notably, as a merit of CeN, dynamic graph topologies are constructed for different input samples as well as for graph convolutional layers of various depths. In addition, three alternative context modeling architectures are explored, which may serve as a guideline for future research on graph topology learning. CeN adds only about 7% extra FLOPs to the baseline model, and Dynamic GCN achieves better performance with $2\times$ to $4\times$ fewer FLOPs than existing methods. By further combining static physical body connections and motion modalities, we achieve state-of-the-art performance on three large-scale benchmarks, namely NTU-RGB+D, NTU-RGB+D 120 and Skeleton-Kinetics.
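The idea of an input-dependent graph topology can be illustrated with a small, loose sketch. This is not the CeN architecture from the paper: the abstract does not give its layers, so the code assumes a single linear map from the flattened features of all joints to an N x N matrix, followed by a row-wise softmax. The names `dynamic_adjacency`, `dynamic_graph_conv`, `w_ctx`, `w_feat`, and `a_static` are hypothetical, and the input is assumed to be already pooled over time. The sketch only shows two properties stated in the abstract: every edge weight depends on the features of all joints, and different input samples yield different graphs.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def dynamic_adjacency(x, w_ctx):
    """Predict an N x N adjacency from joint features x (N x C).

    Assumption: one linear map over the flattened features of ALL joints,
    so each edge weight sees global context. w_ctx has shape (N*C) x (N*N).
    """
    n = len(x)
    flat = [v for joint in x for v in joint]
    logits = matmul([flat], w_ctx)[0]  # length N*N
    return [softmax(logits[i * n:(i + 1) * n]) for i in range(n)]


def dynamic_graph_conv(x, w_ctx, w_feat, a_static=None):
    """One graph convolution Y = A(x) X W with a sample-specific A(x).

    a_static (optional, N x N) is added to the learned graph, standing in
    for fixed physical body connections.
    """
    a = dynamic_adjacency(x, w_ctx)
    if a_static is not None:
        a = [[d + s for d, s in zip(dr, sr)] for dr, sr in zip(a, a_static)]
    return matmul(matmul(a, x), w_feat), a


def rand_matrix(rows, cols):
    """Random matrix with entries in [-1, 1], used as stand-in weights/data."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
N, C_IN, C_OUT = 5, 3, 4          # toy sizes, chosen for illustration only
w_ctx = rand_matrix(N * C_IN, N * N)
w_feat = rand_matrix(C_IN, C_OUT)
x1 = rand_matrix(N, C_IN)         # two different "input samples"
x2 = rand_matrix(N, C_IN)
y1, a1 = dynamic_graph_conv(x1, w_ctx, w_feat)
y2, a2 = dynamic_graph_conv(x2, w_ctx, w_feat)
```

With shared weights, `a1` and `a2` differ because the adjacency is computed from each sample. Stacking several such layers, each with its own `w_ctx`, would likewise give a different graph at each depth.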