Paper Title
An Adversarial Benchmark for Fake News Detection Models
Paper Authors
Paper Abstract
With the proliferation of online misinformation, fake news detection has become increasingly important to the artificial intelligence community. In this paper, we propose an adversarial benchmark that tests the ability of fake news detectors to reason about real-world facts. We formulate adversarial attacks that target three aspects of "understanding": compositional semantics, lexical relations, and sensitivity to modifiers. We evaluate our benchmark on BERT classifiers fine-tuned on the LIAR (arXiv:1705.00648) and Kaggle Fake News datasets, and show that both models fail to respond to changes in compositional and lexical meaning. Our results reinforce the need for such models to be used in conjunction with other fact-checking methods.
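The sketch below is not from the paper; it only illustrates, under stated assumptions, the kind of probe the abstract describes: perturb a claim along one dimension of "understanding" (a negation or a modifier change) and check whether a fine-tuned BERT classifier changes its prediction. The checkpoint name `bert-fakenews-liar` and the example claims are hypothetical placeholders.

```python
# Minimal sketch of an adversarial probe for a fake-news classifier.
# Assumes a BERT sequence-classification checkpoint fine-tuned on a
# fake-news dataset (e.g. LIAR or Kaggle Fake News); the path below
# is a placeholder, not a real published model.
from transformers import pipeline

clf = pipeline("text-classification", model="bert-fakenews-liar")

# One original claim plus variants that flip or weaken its meaning.
claims = {
    "original": "The unemployment rate fell to 4 percent last year.",
    "negated":  "The unemployment rate did not fall to 4 percent last year.",
    "modifier": "The unemployment rate allegedly fell to 4 percent last year.",
}

# A detector that tracks compositional and lexical meaning should not
# assign near-identical labels and scores to all three variants.
for name, text in claims.items():
    pred = clf(text)[0]
    print(f"{name:9s} -> {pred['label']} ({pred['score']:.3f})")
```

If the classifier's label and confidence barely move across the three variants, that is the failure mode the benchmark is designed to expose.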