Can Transformer be Too Compositional? Analysing Idiom Processing in Neural Machine Translation

Paper Title

Can Transformer be Too Compositional? Analysing Idiom Processing in Neural Machine Translation

Authors

Dankers, Verna, Lucas, Christopher G., Titov, Ivan

Abstract

Unlike literal expressions, idioms' meanings do not directly follow from their parts, posing a challenge for neural machine translation (NMT). NMT models are often unable to translate idioms accurately and over-generate compositional, literal translations. In this work, we investigate whether the non-compositionality of idioms is reflected in the mechanics of the dominant NMT model, Transformer, by analysing the hidden states and attention patterns for models with English as source language and one of seven European languages as target language. When Transformer emits a non-literal translation - i.e. identifies the expression as idiomatic - the encoder processes idioms more strongly as single lexical units compared to literal expressions. This manifests in idioms' parts being grouped through attention and in reduced interaction between idioms and their context. In the decoder's cross-attention, figurative inputs result in reduced attention on source-side tokens. These results suggest that Transformer's tendency to process idioms as compositional expressions contributes to literal translations of idioms.
