Paper Title

Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition

Authors

Wei Peng, Jingang Shi, Zhaoqiang Xia, Guoying Zhao

Abstract

Graph Convolutional Networks (GCNs) have already demonstrated their powerful ability to model irregular data, e.g., skeletal data in human action recognition, providing an exciting new way to fuse rich structural information for nodes residing in different parts of a graph. In human action recognition, current works introduce a dynamic graph generation mechanism to better capture the underlying semantic skeleton connections and thus improve performance. In this paper, we provide an orthogonal way to explore the underlying connections. Instead of introducing an expensive dynamic graph generation paradigm, we build a more efficient GCN on a Riemannian manifold, which we believe is a more suitable space for modeling graph data, to make the extracted representations fit the embedding matrix. Specifically, we present a novel spatial-temporal GCN (ST-GCN) architecture defined via Poincaré geometry, enabling it to better model the latent anatomy of the structured data. To further explore the optimal projection dimension in the Riemannian space, we mix different dimensions on the manifold and provide an efficient way to search the dimension for each ST-GCN layer. With the resulting architecture, we evaluate our method on the two current largest-scale 3D skeleton datasets, i.e., NTU RGB+D and NTU RGB+D 120. The comparison results show that the model achieves superior performance under all evaluation metrics with only 40% of the model size of the previous best GCN method, demonstrating the effectiveness of our model.
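To make the geometry concrete, below is a minimal sketch of a graph convolution carried out on the Poincaré ball: node features are pulled to the tangent space at the origin via the logarithmic map, transformed and aggregated there as in a Euclidean GCN, and pushed back with the exponential map. This is a common tangent-space formulation of hyperbolic graph convolution, not the authors' released code; the names (PoincareGCNLayer, expmap0, logmap0, adj) and the unit-curvature setting are assumptions for illustration. Note that the layer's input and output dimensions may differ, which hints at how a per-layer "mixed dimension" could be realized.

```python
# Minimal sketch of a Poincare-ball graph convolution (unit curvature c = 1).
# Assumed, illustrative names throughout -- not the paper's implementation.
import torch
import torch.nn as nn

EPS = 1e-7  # numerical guard near the ball boundary

def expmap0(v, c=1.0):
    """Exponential map at the origin: tangent vector -> Poincare ball."""
    sqrt_c = c ** 0.5
    norm = v.norm(dim=-1, keepdim=True).clamp_min(EPS)
    return torch.tanh(sqrt_c * norm) * v / (sqrt_c * norm)

def logmap0(y, c=1.0):
    """Logarithmic map at the origin: Poincare ball -> tangent space."""
    sqrt_c = c ** 0.5
    norm = y.norm(dim=-1, keepdim=True).clamp_min(EPS)
    return torch.atanh((sqrt_c * norm).clamp(max=1 - EPS)) * y / (sqrt_c * norm)

class PoincareGCNLayer(nn.Module):
    """One graph-convolution step done in the tangent space of the ball.
    in_dim and out_dim may differ, so each layer can use its own dimension."""
    def __init__(self, in_dim, out_dim, c=1.0):
        super().__init__()
        self.c = c
        self.linear = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x, adj):
        # x:   (N, in_dim) points on the Poincare ball
        # adj: (N, N) normalized adjacency, e.g., the skeleton joint graph
        h = logmap0(x, self.c)     # pull features to the tangent space
        h = adj @ self.linear(h)   # Euclidean transform + neighbor aggregation
        return expmap0(h, self.c)  # push the result back onto the ball

# Toy usage on a 5-joint "skeleton" with a placeholder adjacency matrix.
N, d_in, d_out = 5, 16, 8
adj = torch.eye(N)                       # stand-in for a normalized skeleton graph
x = expmap0(torch.randn(N, d_in) * 0.1)  # map random tangent features onto the ball
out = PoincareGCNLayer(d_in, d_out)(x, adj)  # (5, 8) points on the ball
```

Working in the tangent space keeps the per-layer cost close to that of a Euclidean GCN, which is consistent with the paper's stated goal of avoiding an expensive dynamic graph generation step.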
