Paper Title

Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation

Authors

Junhao Liu, Linjun Shou, Jian Pei, Ming Gong, Min Yang, Daxin Jiang

Abstract

Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale annotated datasets in low-resource languages such as Arabic, Hindi, and Vietnamese. Many previous approaches use translation data, obtained by translating from a resource-rich language such as English into the low-resource languages, as auxiliary supervision. However, how to effectively leverage translation data and reduce the impact of the noise introduced by translation remains difficult. In this paper, we tackle this challenge and enhance cross-lingual transfer performance with a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC). A language branch is a group of passages in one single language paired with questions in all target languages. Based on LBMRC, we train multiple machine reading comprehension (MRC) models, each proficient in an individual language. Then, we devise a multilingual distillation approach to amalgamate the knowledge of the multiple language branch models into a single model for all target languages. Combining LBMRC with multilingual distillation makes the model more robust to data noise, thereby improving its cross-lingual ability. Meanwhile, the resulting single multilingual model is applicable to all target languages, which saves the cost of training, inference, and maintenance compared to keeping multiple models. Extensive experiments on two CLMRC benchmarks clearly show the effectiveness of our proposed method.
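
To make the two ideas in the abstract concrete, the sketch below illustrates how a language branch could be assembled and how a multilingual student could be distilled from a language-branch teacher with a temperature-scaled KL loss over answer-span distributions. This is a minimal sketch, not the authors' released code; the data layout, function names, and hyperparameters (temperature, alpha) are illustrative assumptions.

```python
# Minimal sketch of the LBMRC + multilingual distillation idea from the abstract.
# NOT the authors' implementation; data layout, names, and hyperparameters are
# assumptions for illustration only.
import torch
import torch.nn.functional as F

def build_language_branch(examples, branch_lang, target_langs):
    """A language branch: passages in one language paired with questions in all target languages."""
    branch = []
    for ex in examples:
        # Assumed layout: ex["passage"] and ex["question"] map a language code to text.
        passage = ex["passage"][branch_lang]
        for q_lang in target_langs:
            branch.append({"passage": passage,
                           "question": ex["question"][q_lang],
                           "answer": ex["answer"]})
    return branch

def span_distillation_loss(student_start, student_end,
                           teacher_start, teacher_end,
                           gold_start, gold_end,
                           temperature=2.0, alpha=0.5):
    """Hard-label span loss plus soft-target KL loss against a language-branch teacher."""
    # Hard cross-entropy on the gold answer-span boundaries.
    hard = 0.5 * (F.cross_entropy(student_start, gold_start)
                  + F.cross_entropy(student_end, gold_end))
    # Soft KL divergence between temperature-scaled student and teacher span distributions.
    t = temperature
    soft = 0.5 * (F.kl_div(F.log_softmax(student_start / t, dim=-1),
                           F.softmax(teacher_start / t, dim=-1),
                           reduction="batchmean")
                  + F.kl_div(F.log_softmax(student_end / t, dim=-1),
                             F.softmax(teacher_end / t, dim=-1),
                             reduction="batchmean")) * (t * t)
    return alpha * hard + (1.0 - alpha) * soft

# Toy usage: one bilingual example and random span logits over a 128-token passage.
toy_examples = [{"passage": {"en": "Paris is the capital of France.",
                             "de": "Paris ist die Hauptstadt von Frankreich."},
                 "question": {"en": "What is the capital of France?",
                              "de": "Was ist die Hauptstadt von Frankreich?"},
                 "answer": "Paris"}]
print(build_language_branch(toy_examples, branch_lang="de", target_langs=["en", "de"]))

batch_size, seq_len = 4, 128
loss = span_distillation_loss(
    torch.randn(batch_size, seq_len), torch.randn(batch_size, seq_len),  # student start/end logits
    torch.randn(batch_size, seq_len), torch.randn(batch_size, seq_len),  # teacher start/end logits
    torch.randint(0, seq_len, (batch_size,)), torch.randint(0, seq_len, (batch_size,)))
print(loss.item())
```

In the full approach described in the abstract, one such branch teacher would be trained per language and a single multilingual student would be distilled from all of them; the toy call above only shows the per-branch loss term.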
