Paper Title
Hierarchical Memory Learning for Fine-Grained Scene Graph Generation
Authors
Abstract
In Scene Graph Generation (SGG), crowd-sourced labeling leaves coarse and fine predicates mixed together in the dataset, and the long-tail problem is also pronounced. Given this situation, many existing SGG methods treat all predicates equally and learn the model in a single stage under the supervision of mixed-granularity predicates, which leads to relatively coarse predictions. To alleviate the negative impact of suboptimal mixed-granularity annotation and the long-tail effect, this paper proposes a novel Hierarchical Memory Learning (HML) framework that learns the model from simple to complex, similar to the hierarchical memory learning process of human beings. After an autonomous partition of coarse and fine predicates, the model is first trained on the coarse predicates and then learns the fine predicates. To realize this hierarchical learning pattern, this paper formulates, for the first time, the HML framework using new Concept Reconstruction (CR) and Model Reconstruction (MR) constraints. Notably, the HML framework can serve as a general optimization strategy for improving various SGG models, and it achieves significant improvement on the SGG benchmark (i.e., Visual Genome).
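The coarse-then-fine schedule described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the autonomous partition is computed, so the frequency-rank split below is only an assumed stand-in, and the function names (`partition_predicates`, `staged_triplets`), the `coarse_fraction` parameter, and the toy triplets are all hypothetical. The CR and MR constraints are not modeled here; the sketch only shows how training data might be staged.

```python
from collections import Counter


def partition_predicates(labels, coarse_fraction=0.5):
    """Split predicate classes into (coarse, fine) sets by frequency rank.

    Assumption: frequency rank is a placeholder for the paper's
    autonomous partition, whose actual criterion is not given here.
    """
    counts = Counter(labels)
    # most_common() orders classes from most to least frequent.
    ranked = [predicate for predicate, _ in counts.most_common()]
    n_coarse = max(1, int(len(ranked) * coarse_fraction))
    return set(ranked[:n_coarse]), set(ranked[n_coarse:])


def staged_triplets(triplets, coarse, fine):
    """Return the (subject, predicate, object) subsets for each stage."""
    stage1 = [t for t in triplets if t[1] in coarse]  # trained on first
    stage2 = [t for t in triplets if t[1] in fine]    # learned afterwards
    return stage1, stage2


# Hypothetical toy annotations, not taken from Visual Genome.
triplets = [
    ("man", "on", "horse"), ("cup", "on", "table"), ("dog", "on", "sofa"),
    ("man", "has", "hat"), ("cat", "has", "tail"),
    ("man", "riding", "horse"), ("bird", "perched on", "branch"),
]
coarse, fine = partition_predicates([p for _, p, _ in triplets])
stage1, stage2 = staged_triplets(triplets, coarse, fine)
```

In a full pipeline, the model would presumably be optimized on `stage1` first and then on `stage2`, with the additional constraints from the paper applied in the second stage; that part is deliberately left out because the abstract gives no formulas for it.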