论文标题
在多语言模型中分析性别表征
Analyzing Gender Representation in Multilingual Models
论文作者
论文摘要
表明多语言语言模型允许跨脚本和语言进行非平凡的转移。在这项工作中,我们研究了能够转移的内部表示的结构。我们将重点放在性别区分作为实际案例研究的表示上,并研究在跨不同语言的共享子空间中编码性别概念的程度。我们的分析表明,性别表示由几个跨语言共享的重要组成部分以及特定于语言的组成部分组成。与语言无关和特定语言的组成部分的存在为我们做出的有趣的经验观察提供了解释:虽然性别分类跨语言良好地传递了跨语言,对性别删除的干预措施,对单一语言进行了培训,但不会轻易转移给其他人。
Multilingual language models were shown to allow for nontrivial transfer across scripts and languages. In this work, we study the structure of the internal representations that enable this transfer. We focus on the representation of gender distinctions as a practical case study, and examine the extent to which the gender concept is encoded in shared subspaces across different languages. Our analysis shows that gender representations consist of several prominent components that are shared across languages, alongside language-specific components. The existence of language-independent and language-specific components provides an explanation for an intriguing empirical observation we make: while gender classification transfers well across languages, interventions for gender removal, trained on a single language, do not transfer easily to others.