Paper Title

More Bang for Your Buck: Natural Perturbation for Robust Question Answering

Paper Authors

Daniel Khashabi, Tushar Khot, Ashish Sabharwal

Paper Abstract

While recent models have achieved human-level scores on many NLP datasets, we observe that they are considerably sensitive to small changes in input. As an alternative to the standard approach of addressing this issue by constructing training sets of completely new examples, we propose doing so via minimal perturbation of examples. Specifically, our approach involves first collecting a set of seed examples and then applying human-driven natural perturbations (as opposed to rule-based machine perturbations), which often change the gold label as well. Local perturbations have the advantage of being relatively easier (and hence cheaper) to create than writing out completely new examples. To evaluate the impact of this phenomenon, we consider a recent question-answering dataset (BoolQ) and study the benefit of our approach as a function of the perturbation cost ratio, the relative cost of perturbing an existing question vs. creating a new one from scratch. We find that when natural perturbations are moderately cheaper to create, it is more effective to train models using them: such models exhibit higher robustness and better generalization, while retaining performance on the original BoolQ dataset.
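The perturbation cost ratio referred to above is simply the cost of perturbing an existing question divided by the cost of writing a new one from scratch. The sketch below is not from the authors' code and uses illustrative budget numbers only; it shows how a fixed annotation budget translates into example counts under the two strategies, ignoring the cost of collecting the seed examples that perturbations build on.

```python
# Minimal sketch (illustrative assumptions, not figures from the paper):
# compare how many training examples a "write new examples" strategy vs. a
# "perturb seed examples" strategy yields under a fixed annotation budget,
# for a given perturbation cost ratio. Seed-collection cost is ignored here.

def examples_under_budget(budget: float, cost_new: float, cost_ratio: float) -> dict:
    """Return example counts for the two annotation strategies.

    cost_ratio = cost of perturbing an existing question / cost of writing a new one.
    """
    cost_perturb = cost_ratio * cost_new
    return {
        "new_examples_only": int(budget // cost_new),
        "perturbations_only": int(budget // cost_perturb),
    }

if __name__ == "__main__":
    # E.g. with a cost ratio of 0.25, the same budget buys 4x as many perturbed examples.
    print(examples_under_budget(budget=1000.0, cost_new=1.0, cost_ratio=0.25))
```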
