longhorns at DADC 2022: How many linguists does it take to fool a Question Answering model? A systematic approach to adversarial attacks

Paper title

longhorns at DADC 2022: How many linguists does it take to fool a Question Answering model? A systematic approach to adversarial attacks

Authors

Kovatchev, Venelin; Chatterjee, Trina; Govindarajan, Venkata S; Chen, Jifan; Choi, Eunsol; Chronis, Gabriella; Das, Anubrata; Erk, Katrin; Lease, Matthew; Li, Junyi Jessy; Wu, Yating; Mahowald, Kyle

Abstract

Developing methods to adversarially challenge NLP systems is a promising avenue for improving both model performance and interpretability. Here, we describe the approach of the team "longhorns" on Task 1 of the First Workshop on Dynamic Adversarial Data Collection (DADC), which asked teams to manually fool a model on an Extractive Question Answering task. Our team finished first, with a model error rate of 62%. We advocate for a systematic, linguistically informed approach to formulating adversarial questions, and we describe the results of our pilot experiments, as well as our official submission.
