Consistency-Aware Graph Network for Human Interaction Understanding

Paper Title

Consistency-Aware Graph Network for Human Interaction Understanding

Authors

Zhenhua Wang, Jiajun Meng, Dongyan Guo, Jianhua Zhang, Javen Qinfeng Shi, Shengyong Chen

Abstract

Compared with the progress made on human activity classification, much less success has been achieved on human interaction understanding (HIU). Apart from the latter task being much more challenging, the main cause is that recent approaches learn human interactive relations via shallow graphical models, which are inadequate for modeling complicated human interactions. In this paper, we propose a consistency-aware graph network, which combines the representational ability of graph networks with consistency-aware reasoning to facilitate the HIU task. Our network consists of three components: a backbone CNN to extract image features, a factor graph network to learn third-order interactive relations among participants, and a consistency-aware reasoning module to enforce labeling and grouping consistencies. Our key observation is that the consistency-aware reasoning bias for HIU can be embedded into an energy function, minimizing which delivers consistent predictions. An efficient mean-field inference algorithm is proposed, such that all modules of our network can be trained jointly in an end-to-end manner. Experimental results show that our approach achieves leading performance on three benchmarks.
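To make the idea of minimizing an energy with mean-field inference more concrete, here is a minimal sketch. It is not the paper's model: it assumes a simple pairwise Potts-style energy in which two linked people pay a penalty when their labels differ, whereas the abstract describes third-order relations and a learned energy. The names `mean_field`, `softmax_neg`, `unary`, and `weights` are hypothetical and chosen only for illustration.

```python
import math


def softmax_neg(energies):
    """Convert energies into probabilities proportional to exp(-energy)."""
    lowest = min(energies)  # shift for numerical stability
    exps = [math.exp(-(e - lowest)) for e in energies]
    total = sum(exps)
    return [x / total for x in exps]


def mean_field(unary, weights, num_iters=10):
    """Approximate per-person label marginals for an assumed pairwise energy.

    Assumed energy (illustrative only, not taken from the paper):
        E(y) = sum_i unary[i][y_i] + sum_{i<j} weights[i][j] * [y_i != y_j]

    unary[i][k]   : cost of giving person i the label k
    weights[i][j] : symmetric penalty when persons i and j disagree
    """
    n = len(unary)
    k = len(unary[0])
    # Initialise each marginal from the unary costs alone.
    q = [softmax_neg(u) for u in unary]
    for _ in range(num_iters):
        new_q = []
        for i in range(n):
            energies = []
            for label in range(k):
                # Expected disagreement penalty under the neighbours' marginals.
                penalty = sum(
                    weights[i][j] * (1.0 - q[j][label])
                    for j in range(n)
                    if j != i
                )
                energies.append(unary[i][label] + penalty)
            new_q.append(softmax_neg(energies))
        q = new_q  # parallel update of all marginals
    return q


# Two people clearly prefer label 0; the third is nearly undecided.
unary = [[0.0, 2.0], [0.0, 2.0], [1.0, 0.8]]
weights = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
marginals = mean_field(unary, weights)
print([max(range(2), key=lambda lab: m[lab]) for m in marginals])
```

Because every update is a composition of sums and a softmax, the same iterations could be unrolled inside a network and differentiated, which is one common way such inference is trained end to end; whether the paper does exactly this is not stated in the abstract.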
