Paper Title

Scalable Cross Lingual Pivots to Model Pronoun Gender for Translation

Authors

Kellie Webster, Emily Pitler

Abstract

Machine translation systems with inadequate document understanding can make errors when translating dropped or neutral pronouns into languages with gendered pronouns (e.g., English). Predicting the underlying gender of these pronouns is difficult since it is not marked textually and must instead be inferred from coreferent mentions in the context. We propose a novel cross-lingual pivoting technique for automatically producing high-quality gender labels, and show that this data can be used to fine-tune a BERT classifier with 92% F1 for Spanish dropped feminine pronouns, compared with 30-51% for neural machine translation models and 54-71% for a non-fine-tuned BERT model. We augment a neural machine translation model with labels from our classifier to improve pronoun translation, while still having parallelizable translation models that translate a sentence at a time.
