Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection

Paper title

Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection

Authors

Bertie Vidgen, Tristan Thrush, Zeerak Waseem, Douwe Kiela

Abstract

We present a human-and-model-in-the-loop process for dynamically generating datasets and training better performing and more robust hate detection models. We provide a new dataset of ~40,000 entries, generated and labelled by trained annotators over four rounds of dynamic data creation. It includes ~15,000 challenging perturbations and each hateful entry has fine-grained labels for the type and target of hate. Hateful entries make up 54% of the dataset, which is substantially higher than comparable datasets. We show that model performance is substantially improved using this approach. Models trained on later rounds of data collection perform better on test sets and are harder for annotators to trick. They also perform better on HateCheck, a suite of functional tests for online hate detection. We provide the code, dataset and annotation guidelines for other researchers to use. Accepted at ACL 2021.
