Paper Title
Gender Biases Unexpectedly Fluctuate in the Pre-training Stage of Masked Language Models
Authors
Abstract
Masked language models pick up gender biases during pre-training. Such biases are usually attributed to a certain model architecture and its pre-training corpora, with the implicit assumption that other variations in the pre-training process, such as the choices of the random seed or the stopping point, have no effect on the biases measured. However, we show that severe fluctuations exist at the fundamental level of individual templates, invalidating the assumption. Further against the intuition of how humans acquire biases, these fluctuations are not correlated with the certainty of the predicted pronouns or the profession frequencies in pre-training corpora. We release our code and data to benefit future research.