Transferability of Natural Language Inference to Biomedical Question Answering

Paper Title

Transferability of Natural Language Inference to Biomedical Question Answering

Authors

Minbyul Jeong, Mujeen Sung, Gangwoo Kim, Donghyeon Kim, Wonjin Yoon, Jaehyo Yoo, Jaewoo Kang

Abstract

Biomedical question answering (QA) is a challenging task because data are scarce and domain expertise is required. Pre-trained language models have been used to address these issues. Recently, learning relationships between sentence pairs has been shown to improve performance in general QA. In this paper, we focus on applying BioBERT to transfer the knowledge of natural language inference (NLI) to biomedical QA. We observe that BioBERT trained on the NLI dataset obtains better performance on Yes/No (+5.59%), Factoid (+0.53%), and List (+13.58%) type questions compared to the performance obtained in a previous challenge (BioASQ 7B Phase B). We present a sequential transfer learning method that performed well in the 8th BioASQ Challenge (Phase B). In sequential transfer learning, the order in which tasks are fine-tuned is important. We also measure the unanswerable rate of the extractive QA setting when factoid and list type questions are converted to the format of the Stanford Question Answering Dataset (SQuAD).
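
The unanswerable rate mentioned in the abstract can be illustrated with a small sketch. The abstract does not define the rate, so the code below assumes one plausible reading: a question counts as unanswerable when none of its gold answers occurs as an exact (case-insensitive) span in any of its snippets, which is what an extractive SQuAD-style model would need. The function name `convert_to_squad`, the input field names (`id`, `body`, `answers`, `snippets`), and the toy questions are all hypothetical and are not taken from the paper or from the BioASQ data files.

```python
from typing import Dict, List, Tuple


def convert_to_squad(questions: List[Dict]) -> Tuple[List[Dict], float]:
    """Convert factoid/list-style questions to SQuAD-like examples.

    Returns the converted examples and the fraction of questions for which
    no gold answer string appears in any snippet (assumed definition of the
    "unanswerable rate"; the paper's exact definition may differ).
    """
    examples = []
    unanswerable = 0
    for q in questions:
        found = False
        for i, snippet in enumerate(q["snippets"]):
            # Offsets are computed on the lowercased snippet; this assumes
            # lowercasing does not change string length (true for ASCII text).
            lowered = snippet.lower()
            for answer in q["answers"]:
                start = lowered.find(answer.lower())
                if start == -1:
                    continue
                found = True
                examples.append({
                    "id": f'{q["id"]}_{i:03d}',
                    "question": q["body"],
                    "context": snippet,
                    "answers": {
                        "text": [snippet[start:start + len(answer)]],
                        "answer_start": [start],
                    },
                })
                break  # keep one answer span per snippet in this sketch
        if not found:
            unanswerable += 1
    rate = unanswerable / len(questions) if questions else 0.0
    return examples, rate


# Toy data for illustration only (hypothetical, not from BioASQ).
toy_questions = [
    {
        "id": "q1",
        "body": "Which gene is mutated in cystic fibrosis?",
        "answers": ["CFTR"],
        "snippets": [
            "Cystic fibrosis is caused by mutations in the CFTR gene.",
            "The disease mainly affects the lungs.",
        ],
    },
    {
        "id": "q2",
        "body": "Which drug is sold under the brand name Gleevec?",
        "answers": ["imatinib"],
        "snippets": ["Gleevec is used to treat chronic myeloid leukemia."],
    },
]

examples, rate = convert_to_squad(toy_questions)
print(len(examples), rate)
```

In the toy data, the second question has a correct answer that never appears verbatim in its snippet, so it cannot be expressed as an extractive example and counts toward the rate.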
