Paper Title

SCDE: Sentence Cloze Dataset with High Quality Distractors From Examinations

Authors

Xiang Kong, Varun Gangal, Eduard Hovy

Abstract

We introduce SCDE, a dataset to evaluate the performance of computational models through sentence prediction. SCDE is a human-created sentence cloze dataset, collected from public school English examinations. Our task requires a model to fill up multiple blanks in a passage from a shared candidate set with distractors designed by English teachers. Experimental results demonstrate that this task requires the use of non-local, discourse-level context beyond the immediate sentence neighborhood. The blanks require joint solving and significantly impair each other's context. Furthermore, through ablations, we show that the distractors are of high quality and make the task more challenging. Our experiments show that there is a significant performance gap between advanced models (72%) and humans (87%), encouraging future models to bridge this gap.
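
To make the "joint solving" point concrete, here is a minimal sketch of how blanks might be filled together from a shared candidate set. It is not the authors' method: the function names, the toy scores, and the assumption that each candidate fills at most one blank are all illustrative, and `score` stands in for whatever model rates a candidate sentence against a blank.

```python
from itertools import permutations
from typing import Callable, Sequence, Tuple


def solve_jointly(
    num_blanks: int,
    candidates: Sequence[str],
    score: Callable[[int, str], float],
) -> Tuple[int, ...]:
    """Return one candidate index per blank, maximizing the summed score.

    Assumption (illustrative): a candidate may fill at most one blank,
    so we search over ordered selections without repetition. Leftover
    candidates play the role of distractors.
    """
    return max(
        permutations(range(len(candidates)), num_blanks),
        key=lambda assign: sum(
            score(blank, candidates[cand]) for blank, cand in enumerate(assign)
        ),
    )


def blank_accuracy(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of blanks whose predicted candidate matches the gold one."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical scores: candidate "A" looks best for both blanks when each
# blank is considered in isolation, so the blanks compete for it.
TOY_SCORES = {
    (0, "A"): 0.90, (0, "B"): 0.80, (0, "C"): 0.10,
    (1, "A"): 0.95, (1, "B"): 0.20, (1, "C"): 0.10,
}


def toy_score(blank: int, candidate: str) -> float:
    return TOY_SCORES[(blank, candidate)]


prediction = solve_jointly(2, ["A", "B", "C"], toy_score)
accuracy = blank_accuracy(prediction, gold=(1, 0))
```

Under these toy scores, picking the best candidate for each blank independently would assign "A" twice; the joint search instead gives "B" to the first blank and "A" to the second, leaving "C" unused as a distractor. The exhaustive search is only practical for the handful of blanks and candidates in a single passage.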
