Paper Title

Language Anisotropic Cross-Lingual Model Editing

Paper Authors

Yang Xu, Yutai Hou, Wanxiang Che, Min Zhang

Paper Abstract

Multilingual pre-trained language models can learn task-specific abilities or memorize facts across multiple languages, but inevitably make undesired predictions on specific inputs. Under a similar observation, model editing aims to calibrate a model post hoc on specific inputs while preserving the model's original behavior. However, existing work only studies the monolingual scenario, which lacks the cross-lingual transferability needed to perform editing across languages simultaneously. In this work, we focus on cross-lingual model editing. First, we define the cross-lingual model editing task and corresponding metrics, where an edit in one language propagates to the others. Next, we propose a framework that naturally adapts monolingual model editing approaches to the cross-lingual scenario using parallel corpora. Further, we propose language anisotropic editing, which improves cross-lingual editing by amplifying different subsets of parameters for each language. On the newly defined cross-lingual model editing task, we empirically demonstrate the failure of monolingual baselines to propagate edits to multiple languages, and the effectiveness of the proposed language anisotropic model editing. Our code is publicly available at https://github.com/franklear/LiME.
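
The abstract's core mechanism, "amplifying different subsets of parameters for each language", can be pictured as a set of per-language masks that rescale the update produced by an underlying monolingual editor. The sketch below is a minimal, hypothetical illustration under that reading, not the authors' LiME implementation; the names AnisotropicMasks, param_shapes, and apply_masked_edit are invented for the example.

```python
import torch
import torch.nn as nn


class AnisotropicMasks(nn.Module):
    """One learnable, sigmoid-bounded mask per (language, parameter) pair.

    Hypothetical sketch of "amplifying different subsets of parameters
    for each language"; not the authors' implementation.
    """

    def __init__(self, param_shapes: dict, languages: list):
        super().__init__()
        # Zero-initialized logits give a neutral mask of 0.5 everywhere.
        self.logits = nn.ParameterDict({
            self._key(lang, name): nn.Parameter(torch.zeros(shape))
            for lang in languages
            for name, shape in param_shapes.items()
        })

    @staticmethod
    def _key(lang: str, name: str) -> str:
        # Flatten parameter names so keys contain no '.' characters.
        return f"{lang}__{name.replace('.', '_')}"

    def forward(self, lang: str, name: str) -> torch.Tensor:
        # Values in (0, 1): ~1 amplifies an update coordinate, ~0 suppresses it.
        return torch.sigmoid(self.logits[self._key(lang, name)])


def apply_masked_edit(model: nn.Module, base_update: dict,
                      masks: AnisotropicMasks, lang: str) -> None:
    """Rescale a base editor's parameter update per language, then apply it."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in base_update:
                param.add_(masks(lang, name) * base_update[name].to(param))
```

Here base_update stands for the raw update any monolingual editing method would produce (e.g., a gradient- or hypernetwork-based editor). Training the masks jointly on parallel edit examples across languages is what would push each language toward its own parameter subset, matching the "language anisotropic" idea described in the abstract.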
