Paper Title
Mitigating Toxic Degeneration with Empathetic Data: Exploring the Relationship Between Toxicity and Empathy
Paper Authors
Paper Abstract
Large pre-trained neural language models have supported the effectiveness of many NLP tasks, yet are still prone to generating toxic language, which hinders the safety of their use. Using empathetic data, we improve over recent work on controllable text generation that aims to reduce the toxicity of generated text. We find we are able to dramatically reduce the size of the fine-tuning data to 7.5k-30k samples while at the same time making significant improvements over state-of-the-art toxicity mitigation, achieving up to a 3.4% absolute reduction (26% relative) over the original work on 2.3M samples, by strategically sampling data based on empathy scores. We observe that the degree of improvement depends on the specific communication components of empathy. In particular, the cognitive components of empathy significantly beat the original dataset in almost all experiments, while emotional empathy was tied to smaller improvements and even underperformed random samples of the original data. This insight has particular implications for NLP work on empathy, as until recently the research and resources built for it have treated empathy exclusively as an emotional concept.
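The core technique described in the abstract is strategic sampling of fine-tuning data by empathy score. Below is a minimal Python sketch of one plausible reading of that procedure; the `score_empathy` function, the component names ("cognitive", "emotional"), and the selection-by-top-k strategy are all illustrative assumptions, since the abstract does not specify the paper's actual scoring model, components, or selection rule.

```python
# Hedged sketch: shrink a large fine-tuning corpus to a small subset of
# high-empathy samples, selected on one communication component of empathy.
import random


def score_empathy(text: str) -> dict:
    """Hypothetical scorer. In practice this would be a trained empathy
    classifier returning per-component scores in [0, 1]; here we use
    random scores purely so the sketch runs end to end."""
    return {"cognitive": random.random(), "emotional": random.random()}


def sample_by_empathy(corpus: list, component: str, k: int) -> list:
    """Keep the k samples scoring highest on the chosen empathy component,
    e.g. reducing a 2.3M-sample corpus to a 7.5k-30k-sample subset."""
    scored = [(score_empathy(text)[component], text) for text in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Usage (illustrative): select high-cognitive-empathy samples, then
# fine-tune the language model on `finetune_data` as in standard
# detoxification-by-fine-tuning pipelines.
corpus = [f"sample text {i}" for i in range(100_000)]  # stand-in corpus
finetune_data = sample_by_empathy(corpus, component="cognitive", k=30_000)
```

Per the abstract's findings, selecting on the cognitive component would be the better-performing choice, while selecting on the emotional component was tied to smaller improvements.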