Paper Title
An Adversarial Benchmark for Fake News Detection Models
Paper Authors
Paper Abstract
With the proliferation of online misinformation, fake news detection has become increasingly important to the artificial intelligence community. In this paper, we propose an adversarial benchmark that tests the ability of fake news detectors to reason about real-world facts. We formulate adversarial attacks that target three aspects of "understanding": compositional semantics, lexical relations, and sensitivity to modifiers. We evaluate our benchmark on BERT classifiers fine-tuned on the LIAR (arXiv:1705.00648) and Kaggle Fake News datasets, and show that both models fail to respond to changes in compositional and lexical meaning. Our results reinforce the need for such models to be used in conjunction with other fact-checking methods.
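The sketch below is not from the paper; it only illustrates, under stated assumptions, the kind of probe the abstract describes: perturb a claim along one dimension of "understanding" (a negation or a modifier change) and check whether a fine-tuned BERT classifier changes its prediction. The checkpoint name `bert-fakenews-liar` and the example claims are hypothetical placeholders.

```python
# Minimal sketch of an adversarial probe for a fake-news classifier.
# Assumes a BERT sequence-classification checkpoint fine-tuned on a
# fake-news dataset (e.g. LIAR or Kaggle Fake News); the path below
# is a placeholder, not a real published model.
from transformers import pipeline

clf = pipeline("text-classification", model="bert-fakenews-liar")

# One original claim plus variants that flip or weaken its meaning.
claims = {
    "original": "The unemployment rate fell to 4 percent last year.",
    "negated":  "The unemployment rate did not fall to 4 percent last year.",
    "modifier": "The unemployment rate allegedly fell to 4 percent last year.",
}

# A detector that tracks compositional and lexical meaning should not
# assign near-identical labels and scores to all three variants.
for name, text in claims.items():
    pred = clf(text)[0]
    print(f"{name:9s} -> {pred['label']} ({pred['score']:.3f})")
```

If the classifier's label and confidence barely move across the three variants, that is the failure mode the benchmark is designed to expose.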