Paper Title

Classification by Attention: Scene Graph Classification with Prior Knowledge

Authors

Sahand Sharifzadeh, Sina Moayed Baharlou, Volker Tresp

Abstract

A major challenge in scene graph classification is that the appearance of objects and relations can differ significantly from one image to another. Previous works have addressed this by relational reasoning over all objects in an image or by incorporating prior knowledge into classification. Unlike previous works, we do not consider separate models for perception and prior knowledge. Instead, we take a multi-task learning approach, in which we implement the classification as an attention layer. This allows the prior knowledge to emerge and propagate within the perception model. By enforcing the model also to represent the prior, we achieve a strong inductive bias. We show that our model can accurately generate commonsense knowledge and that the iterative injection of this knowledge into scene representations leads to significantly higher classification performance. Additionally, our model can be fine-tuned on external knowledge given as triples. When combined with self-supervised learning and only 1% of annotated images, this gives more than a 3% improvement in object classification, 26% in scene graph classification, and 36% in predicate prediction accuracy.
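The abstract's core idea, implementing classification itself as an attention layer so that class priors feed back into the scene representation, can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's actual architecture: the function names, the dot-product attention form, and the use of class embeddings as both keys and values are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def classify_by_attention(node_features, class_embeddings):
    """Hypothetical sketch of classification as an attention layer.

    Each node (object) feature acts as a query; learned class embeddings
    act as keys and values. The attention weights then double as a class
    distribution, and the attended output is a prior-informed feature that
    could be injected back into the perception model, mirroring the
    iterative knowledge injection the abstract describes.
    """
    d = class_embeddings.shape[1]
    scores = node_features @ class_embeddings.T / np.sqrt(d)  # (N, C)
    probs = softmax(scores, axis=-1)    # attention weights = class probabilities
    refined = probs @ class_embeddings  # knowledge-injected representation (N, d)
    return probs, refined

# Toy usage: 4 detected objects with 16-dim features, 10 object classes.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 16))
classes = rng.normal(size=(10, 16))
probs, refined = classify_by_attention(feats, classes)
```

In a trained model the class embeddings would be learned jointly with the perception backbone, and the refined features would be fed through further layers, which is one way the prior can "emerge and propagate" inside the perception model.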
