Deep Learning for Hate Speech Detection: A Comparative Study

Paper Title

Deep Learning for Hate Speech Detection: A Comparative Study

Authors

Jitendra Singh Malik, Hezhe Qiao, Guansong Pang, Anton van den Hengel

Abstract

Automated hate speech detection is an important tool in combating the spread of hate speech, particularly in social media. Numerous methods have been developed for the task, including a recent proliferation of deep-learning-based approaches. A variety of datasets have also been developed, exemplifying various manifestations of the hate-speech detection problem. We present here a large-scale empirical comparison of deep and shallow hate-speech detection methods, mediated through the three most commonly used datasets. Our goal is to illuminate progress in the area and identify strengths and weaknesses in the current state of the art. We particularly focus our analysis on measures of practical performance, including detection accuracy, computational efficiency, capability in using pre-trained models, and domain generalization. In doing so, we aim to provide guidance as to the use of hate-speech detection in practice, quantify the state of the art, and identify future research directions. Code and dataset are available at https://github.com/jmjmalik22/Hate-Speech-Detection.
