Paper Title

Toxic Language Detection in Social Media for Brazilian Portuguese: New Dataset and Multilingual Analysis

Authors

João A. Leite, Diego F. Silva, Kalina Bontcheva, Carolina Scarton

Abstract

Hate speech and toxic comments are a common concern of social media platform users. Although these comments are, fortunately, the minority on these platforms, they are still capable of causing harm. Therefore, identifying these comments is an important task for studying and preventing the proliferation of toxicity in social media. Previous work on automatically detecting toxic comments focuses mainly on English, with very little work on languages like Brazilian Portuguese. In this paper, we propose a new large-scale dataset for Brazilian Portuguese with tweets annotated as toxic or non-toxic, or as belonging to different types of toxicity. We present our dataset collection and annotation process, in which we aimed to select candidates covering multiple demographic groups. State-of-the-art BERT models were able to achieve a 76% macro-F1 score using monolingual data in the binary case. We also show that large-scale monolingual data is still needed to create more accurate models, despite recent advances in multilingual approaches. An error analysis and experiments with multi-label classification show the difficulty of classifying certain types of toxic comments that appear less frequently in our data and highlight the need to develop models that are aware of different categories of toxicity.
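As a rough illustration of the binary setup described in the abstract, the sketch below fine-tunes a BERT-style classifier on labelled tweets and reports macro-F1. It is a minimal sketch under stated assumptions: the checkpoint (bert-base-multilingual-cased), the toy example tweets, and the hyperparameters are illustrative choices, not the authors' exact configuration.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from sklearn.metrics import f1_score

# Illustrative (hypothetical) labelled tweets: 1 = toxic, 0 = non-toxic.
texts = ["exemplo de tweet neutro", "exemplo de tweet ofensivo"]
labels = [0, 1]

# Assumed multilingual baseline checkpoint; the paper's exact models may differ.
checkpoint = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Tokenise the tweets and attach the gold labels.
batch = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
batch["labels"] = torch.tensor(labels)

# One fine-tuning step; a real run would iterate over mini-batches and epochs.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
loss = model(**batch).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()

# Evaluate with macro-F1, the metric reported in the abstract.
model.eval()
with torch.no_grad():
    logits = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"]).logits
preds = logits.argmax(dim=-1).tolist()
print("macro-F1:", f1_score(labels, preds, average="macro"))

For the multi-label experiments mentioned in the abstract, the same architecture can be adapted by configuring the classification head for multi-label output (in Hugging Face transformers, passing problem_type="multi_label_classification" switches the loss to binary cross-entropy over the toxicity categories); again, this is a general recipe rather than the authors' exact setup.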
