Paper Title

Generating Categories for Sets of Entities

Authors

Shuo Zhang, Krisztian Balog, Jamie Callan

Abstract

Category systems are central components of knowledge bases, as they provide a hierarchical grouping of semantically related concepts and entities. They are a unique and valuable resource that is utilized in a broad range of information access tasks. To aid knowledge editors in the manual process of expanding a category system, this paper presents a method for generating categories for sets of entities. First, we employ neural abstractive summarization models to generate candidate categories. Next, the location within the hierarchy is identified for each candidate. Finally, structure-, content-, and hierarchy-based features are used to rank the candidates and identify the most promising ones (measured in terms of specificity, hierarchy, and importance). We develop a test collection based on Wikipedia categories and demonstrate the effectiveness of the proposed approach.
