Paper Title
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders
Paper Authors
Paper Abstract
Multilingual machine translation enables a single model to translate between different languages. Most existing multilingual machine translation systems adopt a randomly initialized Transformer backbone. In this work, inspired by the recent success of language model pre-training, we present XLM-T, which initializes the model with an off-the-shelf pretrained cross-lingual Transformer encoder and fine-tunes it with multilingual parallel data. This simple method achieves significant improvements on a WMT dataset with 10 language pairs and on the OPUS-100 corpus with 94 pairs. Surprisingly, the method remains effective even on top of a strong back-translation baseline. Moreover, extensive analysis of XLM-T on unsupervised syntactic parsing, word alignment, and multilingual classification explains its effectiveness for machine translation. The code will be available at https://aka.ms/xlm-t.
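To make the core idea concrete, the following is a minimal sketch in Python of what "initialize with a pretrained cross-lingual encoder and fine-tune on multilingual parallel data" can look like. It uses the Hugging Face transformers library and the xlm-roberta-base checkpoint purely as stand-ins; this is an illustrative assumption, not the authors' released implementation.

```python
# Minimal sketch (assumed setup, not the authors' released code):
# build a sequence-to-sequence model whose encoder and decoder are both
# initialized from a pretrained cross-lingual encoder (XLM-R here), with the
# decoder's cross-attention weights newly initialized, then run one
# fine-tuning step on a toy parallel sentence pair.
from transformers import AutoTokenizer, EncoderDecoderModel

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "xlm-roberta-base", "xlm-roberta-base"
)
# Needed so the library can shift labels into decoder inputs when computing loss.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# One toy source/target pair; real fine-tuning would iterate over a
# multilingual parallel corpus such as the WMT or OPUS-100 data used in the paper.
src = tokenizer("Hello, world!", return_tensors="pt")
tgt = tokenizer("Bonjour le monde !", return_tensors="pt")

outputs = model(
    input_ids=src.input_ids,
    attention_mask=src.attention_mask,
    labels=tgt.input_ids,
)
outputs.loss.backward()  # an optimizer step would follow in actual training
```

The design choice this sketch mirrors is the one stated in the abstract: rather than training a randomly initialized Transformer from scratch, the encoder (and, in XLM-T, also the decoder body) starts from cross-lingual pretrained weights, so only the remaining components need to be learned during fine-tuning on parallel data.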