Paper Title

SPOT: Knowledge-Enhanced Language Representations for Information Extraction

Paper Authors

Jiacheng Li, Yannis Katsis, Tyler Baldwin, Ho-Cheol Kim, Andrew Bartko, Julian McAuley, Chun-Nan Hsu

Abstract

Knowledge-enhanced pre-trained models for language representation have been shown to be more effective in knowledge base construction tasks (i.e., relation extraction) than language models such as BERT. These knowledge-enhanced language models incorporate knowledge into pre-training to generate representations of entities or relationships. However, existing methods typically represent each entity with a separate embedding. As a result, they struggle to represent out-of-vocabulary entities, they require a large number of parameters on top of the underlying token model (i.e., the transformer), and the number of entities they can handle is limited in practice due to memory constraints. Moreover, existing models still struggle to represent entities and relationships simultaneously. To address these problems, we propose a new pre-trained model that learns representations of entities and relationships from token spans and span pairs in the text, respectively. By encoding spans efficiently with span modules, our model can represent both entities and their relationships while requiring fewer parameters than existing models. We pre-train our model with a knowledge graph extracted from Wikipedia and test it on a broad range of supervised and unsupervised information extraction tasks. Results show that our model learns better representations of both entities and relationships than baselines; in supervised settings, fine-tuning our model consistently outperforms RoBERTa and achieves competitive results on information extraction tasks.
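
The abstract describes the core idea at a high level: entities are represented by token spans and relations by span pairs, encoded with lightweight span modules instead of one embedding per entity. The following is a minimal sketch of that idea in a PyTorch setting; the class names SpanEncoder and SpanPairEncoder, the mean/max pooling, and all dimensions are illustrative assumptions, not the paper's actual architecture.

import torch
import torch.nn as nn


class SpanEncoder(nn.Module):
    """Hypothetical span module: pools contextual token embeddings over a
    span and projects them to a fixed-size entity representation."""

    def __init__(self, hidden_size: int, span_dim: int):
        super().__init__()
        # mean-pooled and max-pooled token vectors are concatenated, hence 2x
        self.proj = nn.Linear(2 * hidden_size, span_dim)

    def forward(self, token_embeds: torch.Tensor, start: int, end: int) -> torch.Tensor:
        # token_embeds: (seq_len, hidden_size), e.g. the output of a transformer encoder
        span = token_embeds[start:end + 1]
        pooled = torch.cat([span.mean(dim=0), span.max(dim=0).values])
        return self.proj(pooled)


class SpanPairEncoder(nn.Module):
    """Hypothetical relation module: concatenates the representations of a
    span pair and maps them to a relation representation."""

    def __init__(self, span_dim: int, rel_dim: int):
        super().__init__()
        self.proj = nn.Linear(2 * span_dim, rel_dim)

    def forward(self, head_span: torch.Tensor, tail_span: torch.Tensor) -> torch.Tensor:
        return self.proj(torch.cat([head_span, tail_span]))


# Usage sketch with dummy contextual embeddings for a 12-token sentence.
token_embeds = torch.randn(12, 768)
span_enc, pair_enc = SpanEncoder(768, 256), SpanPairEncoder(256, 256)
head = span_enc(token_embeds, 0, 1)   # entity span covering tokens 0-1
tail = span_enc(token_embeds, 7, 9)   # entity span covering tokens 7-9
relation = pair_enc(head, tail)       # span-pair (relation) representation

Because spans are built from the token encoder's own outputs, this style of model needs no entity vocabulary, which is why parameter count and memory do not grow with the number of entities.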
