Paper title
Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods
Paper authors
Paper abstract
Information extraction from scientific literature can be challenging due to the highly specialised nature of such text. We describe the entity recognition methods we developed for the DEAL (Detecting Entities in the Astrophysics Literature) shared task. The aim of the task is to build a system that can identify named entities in a dataset composed of scholarly articles from the astrophysics literature. We planned our participation so that it would enable an empirical comparison between word-based tagging and span-based classification methods. When evaluated on two hidden test sets provided by the organizer, our best-performing submission achieved $F_1$ scores of 0.8307 (validation phase) and 0.7990 (testing phase).
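To make the two formulations concrete, the sketch below shows how a word-based tag sequence maps onto the span representation that a span-based classifier predicts directly, and how an exact-match $F_1$ over spans can be computed. It is only an illustration: the BIO tagging scheme, the example sentence, the label names `Telescope` and `CelestialObject`, and the helper names `bio_to_spans` and `span_f1` are assumptions made here, and the scoring shown is not necessarily the metric used by the shared task organizer.

```python
from typing import List, Tuple

# A span is (start_token, end_token_exclusive, label).
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Convert a word-level BIO tag sequence into labelled spans.

    Stray I- tags that do not continue an open span of the same
    label are ignored (an assumption of this sketch).
    """
    spans: List[Span] = []
    start, label = None, None
    # A trailing "O" acts as a sentinel that closes any open span.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def span_f1(gold: List[Span], pred: List[Span]) -> float:
    """Exact-match F1: a predicted span counts only if both its
    boundaries and its label match a gold span."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example sentence and labels.
tokens = ["The", "Hubble", "Telescope", "observed", "M31"]
gold_tags = ["O", "B-Telescope", "I-Telescope", "O", "B-CelestialObject"]

gold_spans = bio_to_spans(gold_tags)
# A span-based model would output spans directly; here it finds one of two.
pred_spans = [(1, 3, "Telescope")]

print(gold_spans)
print(round(span_f1(gold_spans, pred_spans), 4))
```

In the word-based view the model assigns one tag per token and spans are recovered afterwards, as `bio_to_spans` does; in the span-based view candidate spans are enumerated and classified directly, so no decoding step is needed before scoring.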