Paper title
The first neural machine translation system for the Erzya language
Authors
Abstract
We present the first neural machine translation system for translation between the endangered Erzya language and Russian, along with the dataset we collected to train and evaluate it. The BLEU scores are 17 and 19 for translation into Erzya and into Russian, respectively, and native speakers rate more than half of the translations as acceptable. We also adapt our model to translate between Erzya and 10 other languages, but without additional parallel data, the quality in these directions remains low. We release the translation models along with the collected text corpus, a new language identification model, and a multilingual sentence encoder adapted for the Erzya language. These resources will be available at https://github.com/slone-nlp/myv-nmt.