Paper Title
Investigating Massive Multilingual Pre-Trained Machine Translation Models for Clinical Domain via Transfer Learning
Paper Authors
Paper Abstract
Massively multilingual pre-trained language models (MMPLMs) have been developed in recent years, demonstrating their strong capabilities and the prior knowledge they acquire for downstream tasks. This work investigates whether MMPLMs can be applied to clinical domain machine translation (MT) for entirely unseen languages via transfer learning. We carry out an experimental investigation using Meta-AI's MMPLMs ``wmt21-dense-24-wide-en-X and X-en (WMT21fb)'', which were pre-trained on 7 language pairs and 14 translation directions, covering English to and from Czech, German, Hausa, Icelandic, Japanese, Russian, and Chinese. We fine-tune these MMPLMs towards the English-\textit{Spanish} language pair, which \textit{did not exist at all}, either explicitly or implicitly, in their original pre-training corpora. We prepare carefully aligned \textit{clinical}-domain data for this fine-tuning, which differs from their original mixed-domain knowledge. Our experimental results show that the fine-tuning is highly successful using only 250k well-aligned in-domain EN-ES segments, evaluated on three translation sub-tasks: clinical cases, clinical terms, and ontology concepts. The fine-tuned models achieve evaluation scores very close to those of NLLB, another MMPLM from Meta-AI, which included Spanish as a high-resource language during pre-training. To the best of our knowledge, this is the first work to successfully use MMPLMs for \textit{clinical-domain transfer-learning NMT} on languages entirely unseen during pre-training.
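The abstract describes fine-tuning the WMT21fb checkpoints on aligned EN-ES clinical segments. Below is a minimal, illustrative sketch of such a fine-tuning setup; it is not the authors' actual pipeline. It assumes the Hugging Face ports of the models (e.g. facebook/wmt21-dense-24-wide-en-x), the transformers and datasets libraries, and a hypothetical tab-separated file clinical_en_es.tsv with "en" and "es" columns. How the unseen Spanish target is mapped onto the model's language codes is a design choice not detailed in the abstract; the sketch simply reuses an existing target-language code as a placeholder.

```python
# Illustrative fine-tuning sketch only; assumptions are noted inline.
from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

model_name = "facebook/wmt21-dense-24-wide-en-x"  # Hugging Face port of WMT21fb en-X
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

tokenizer.src_lang = "en"
# Placeholder: "de" is one of the pre-trained target codes; standing in here for
# the unseen Spanish target, whose handling the abstract does not specify.
tokenizer.tgt_lang = "de"

# Hypothetical parallel clinical corpus with "en" and "es" columns.
raw = load_dataset("csv", data_files="clinical_en_es.tsv", delimiter="\t")["train"]

def preprocess(batch):
    # Tokenize English source and Spanish target; targets become the labels.
    return tokenizer(batch["en"], text_target=batch["es"],
                     max_length=512, truncation=True)

tokenized = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="wmt21fb-en-es-clinical",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
    num_train_epochs=1,
    fp16=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

The hyperparameters above (batch size, learning rate, epochs) are arbitrary placeholders; the paper's actual training configuration, hardware, and evaluation against NLLB are described in its experimental sections, not here.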