Paper Title

BioLORD: Learning Ontological Representations from Definitions (for Biomedical Concepts and their Textual Descriptions)

Authors

François Remy, Kris Demuynck, Thomas Demeester

Abstract

This work introduces BioLORD, a new pre-training strategy for producing meaningful representations for clinical sentences and biomedical concepts. State-of-the-art methodologies operate by maximizing the similarity in representation of names referring to the same concept, and preventing collapse through contrastive learning. However, because biomedical names are not always self-explanatory, this sometimes results in non-semantic representations. BioLORD overcomes this issue by grounding its concept representations using definitions, as well as short descriptions derived from a multi-relational knowledge graph consisting of biomedical ontologies. Thanks to this grounding, the model produces more semantic concept representations that match the hierarchical structure of ontologies more closely. BioLORD establishes a new state of the art for text similarity on both clinical sentences (MedSTS) and biomedical concepts (MayoSRS).
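
The contrastive idea mentioned in the abstract can be illustrated with a small sketch. It is not the authors' training code: it assumes an InfoNCE-style loss over cosine similarities, and the function names, the temperature value, and the two-dimensional toy vectors are all hypothetical. Each concept name is pulled toward the embedding of its own definition and pushed away from the definitions of other concepts in the batch, which is what penalizes collapsed representations.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(name_vecs, definition_vecs, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    Assumption: name_vecs[i] and definition_vecs[i] describe the same
    concept; every other definition in the batch acts as a negative.
    """
    total = 0.0
    for i, name in enumerate(name_vecs):
        logits = [cosine(name, d) / temperature for d in definition_vecs]
        # Numerically stable log-sum-exp over all candidate definitions.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(x - max_logit) for x in logits)
        )
        # Negative log-probability of picking the matching definition.
        total += log_denominator - logits[i]
    return total / len(name_vecs)


# Toy, made-up embeddings for two concepts.
definitions = [[0.9, 0.1], [0.1, 0.9]]
aligned_names = [[1.0, 0.0], [0.0, 1.0]]    # each name close to its definition
collapsed_names = [[1.0, 1.0], [1.0, 1.0]]  # all names mapped to one point

aligned_loss = info_nce_loss(aligned_names, definitions)
collapsed_loss = info_nce_loss(collapsed_names, definitions)
print(aligned_loss < collapsed_loss)
```

In this toy setting the collapsed names cannot tell the two definitions apart, so their loss equals log(2), while the aligned names get a loss close to zero.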
