Paper Title

Linguistically-driven Multi-task Pre-training for Low-resource Neural Machine Translation

Authors

Zhuoyuan Mao, Chenhui Chu, Sadao Kurohashi

Abstract

In the present study, we propose novel sequence-to-sequence pre-training objectives for low-resource neural machine translation (NMT): Japanese-specific sequence-to-sequence (JASS) pre-training for language pairs involving Japanese as the source or target language, and English-specific sequence-to-sequence (ENSS) pre-training for language pairs involving English. JASS focuses on masking and reordering Japanese linguistic units known as bunsetsu, whereas ENSS is based on phrase-structure masking and reordering tasks. Experiments on ASPEC Japanese--English and Japanese--Chinese, Wikipedia Japanese--Chinese, and News English--Korean corpora demonstrate that JASS and ENSS outperform MASS and other existing language-agnostic pre-training methods by up to +2.9 BLEU points for the Japanese--English tasks, up to +7.0 BLEU points for the Japanese--Chinese tasks, and up to +1.3 BLEU points for the English--Korean tasks. Empirical analysis focusing on the relationship between the individual subtasks of JASS and ENSS reveals their complementary nature. Adequacy evaluation using LASER, human evaluation, and case studies shows that our proposed methods significantly outperform pre-training methods without injected linguistic knowledge, and that they have a larger positive impact on adequacy than on fluency. We release our code here: https://github.com/Mao-KU/JASS/tree/master/linguistically-driven-pretraining.
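To make the chunk-level masking-and-reordering idea concrete, below is a minimal, hypothetical Python sketch of how pre-training examples might be built from a sentence that has already been segmented into bunsetsu (or phrase) chunks. The chunk list, function names, and masking ratio are illustrative assumptions rather than the authors' implementation; the actual JASS/ENSS objectives and preprocessing are defined in the paper and the released code linked above.

import random
from typing import List, Tuple

MASK = "[MASK]"

def chunk_mask(chunks: List[str], ratio: float = 0.5) -> Tuple[List[str], List[str]]:
    # MASS-style span masking at the chunk level (illustrative):
    # a contiguous run of chunks is replaced by [MASK] on the encoder
    # side, and the decoder is trained to reconstruct that run.
    n = max(1, int(len(chunks) * ratio))
    start = random.randint(0, len(chunks) - n)
    source = chunks[:start] + [MASK] * n + chunks[start + n:]
    target = chunks[start:start + n]
    return source, target

def chunk_reorder(chunks: List[str]) -> Tuple[List[str], List[str]]:
    # Reordering subtask (illustrative): the encoder sees the chunks
    # in shuffled order and the decoder restores the original order.
    shuffled = chunks[:]
    random.shuffle(shuffled)
    return shuffled, chunks

if __name__ == "__main__":
    # Hypothetical pre-segmented Japanese sentence; in practice a
    # bunsetsu analyzer (Japanese) or phrase-structure parser (English)
    # would supply the chunk boundaries.
    chunks = ["私は", "昨日", "図書館で", "本を", "読んだ"]
    print(chunk_mask(chunks))
    print(chunk_reorder(chunks))

In this sketch, the masking and reordering functions produce separate (source, target) pairs, reflecting the abstract's description of complementary subtasks that can be combined in multi-task pre-training.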
